27123 

PATENT TRADEMARK OFFICE



Docket No. 2976-4044 US1 PATENT APPLICATION 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE



UTILITY PATENT APPLICATION 



TITLE: NUCLEOTIDE AND AMINO ACID SEQUENCES RELATING TO
RESPIRATORY DISEASES AND OBESITY



INVENTOR(S): Tim Keith et al. 



EL 912 004 429 US 






NUCLEOTIDE AND AMINO ACID SEQUENCES RELATING TO 
RESPIRATORY DISEASES AND OBESITY 

RELATED APPLICATIONS

This application is a continuation-in-part of U.S. Application Serial
Number 60/211,749, filed June 14, 2000, which is incorporated by reference
in its entirety.

FIELD OF THE INVENTION

This invention relates to genes identified from human chromosome
12q23-qter, including Gene 454, Gene 561, and Gene 757, which are
associated with asthma, obesity, inflammatory bowel disease, and other
human diseases. The invention also relates to the nucleotide sequences of
these genes, including genomic DNA sequences, cDNA sequences, and
single nucleotide polymorphisms. The invention further relates to isolated
nucleic acids comprising these nucleotide sequences, and isolated
polypeptides or peptides encoded thereby. Also related are expression
vectors and host cells comprising the disclosed nucleic acids or fragments
thereof, as well as antibodies that bind to the encoded polypeptides or
peptides. The present invention further relates to ligands that modulate the
activity of the disclosed genes or gene products. In addition, the invention
relates to diagnostics and therapeutics for various diseases, including
asthma, utilizing the disclosed nucleic acids, polypeptides or peptides,
antibodies, and/or ligands.

DESCRIPTION OF THE SEQUENCE LISTING

Incorporated herein by reference in its entirety is a Sequence Listing,
comprising SEQ ID NO:1 to SEQ ID NO:4687. The Sequence Listing is
contained on a CD-ROM, three copies of which are filed, the Sequence
Listing being in a computer-readable ASCII file named "Seqlist.bcf," created
on June 7, 2001 and of 11,976 kilobytes in size, in IBM-PC Windows®NT
v4.0 format.

BACKGROUND 

Asthma has been linked to markers on human chromosome 12
(Wilson et al., 1998, Genomics, 53: 251-259). In addition, obesity has been
linked to asthma (Wilson et al., 1999, Arch. Intern. Med. 159: 2513-14). In
particular, chromosomal region 12q23-qter has been associated with a
variety of genetic disorders, including male germ cell tumors, histidinemia,
growth retardation with deafness and mental retardation, deficiency of
Acyl-CoA dehydrogenase, spinal muscular atrophy, Darier disease,
cardiomyopathy, Spinocerebellar ataxia-2, brachydactyly,
Mevalonicaciduria, Hyperimmunoglobulinemia D, Noonan syndrome-1,
Cardiofaciocutaneous syndrome, spinal muscular atrophy-4, tyrosinemia,
phenylketonuria, B-cell non-Hodgkin lymphoma, Ulnar-mammary syndrome,
Holt-Oram syndrome, Scapuloperoneal spinal muscular atrophy, alcohol
intolerance, MODY, Diabetes mellitus, noninsulin-dependent 2, and diabetes
mellitus insulin-dependent (See National Center for Biotechnology
Information: http://www.ncbi.nlm.nih.gov/omim/). The genes of this region
are also associated with obesity and lung disease, particularly inflammatory
lung disease phenotypes such as Chronic Obstructive Lung Disease
(COPD), Adult Respiratory Distress Syndrome (ARDS), and asthma.
However, few genes in chromosomal region 12q23-qter have been
discovered. Thus, there is a need in the art for the identification of specific
genes that are involved in these disorders. Identification and
characterization of such genes will allow the development of effective
diagnostics and therapeutic means to diagnose, prevent, and/or treat lung
related disorders, as well as the other diseases described herein.

SUMMARY OF THE INVENTION 

This invention relates to isolated DNA comprising genes located on
chromosome 12q23-qter (see Table 4). In specific embodiments, the
invention relates to isolated nucleic acids comprising 12q23-qter genomic
sequences (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to
SEQ ID NO:4973), cDNA and EST sequences (e.g., SEQ ID NO:1 to SEQ
ID NO:92), BAC sequences (e.g., SEQ ID NO:156 to SEQ ID NO:693), BAC
clones and contigs (e.g., SEQ ID NO:694 to SEQ ID NO:1265), direct
selected sequences (e.g., SEQ ID NO:1266 to SEQ ID NO:2052), clusters
(e.g., SEQ ID NO:2053 to SEQ ID NO:4973), complementary sequences,
sequence variants, or fragments thereof, as described herein. The present
invention also encompasses nucleic acid probes or primers useful for
assaying a biological sample for the presence or expression of 12q23-qter
genes.

The invention further encompasses nucleic acid variants comprising
single nucleotide polymorphisms (SNPs) identified in several 12q23-qter
genes (Table 10; Figures 7A-7H; Figures 9A-9F; Figures 27A-27K; and
Figures 28A-28C). These include SNPs for gene 454 (SEQ ID NO:19;
Figures 7A-7H), gene 561.1 (SEQ ID NO:31; Figures 27A-27K), gene 561.2
(SEQ ID NO:32; Figures 28A-28C), and gene 757 (SEQ ID NO:90; Figures
9A-9F). SNPs can be used to diagnose diseases such as asthma, or to
determine a genetic predisposition thereto. In addition, the present
invention encompasses nucleic acids comprising alternate splicing variants
(e.g., SEQ ID NO:1 to SEQ ID NO:5; SEQ ID NO:17 to SEQ ID NO:18; SEQ
ID NO:36 to SEQ ID NO:37; SEQ ID NO:43 to SEQ ID NO:44; and SEQ ID
NO:80 to SEQ ID NO:81).

This invention also relates to vectors and host cells comprising
vectors comprising the 12q23-qter nucleic acid sequences disclosed herein.
Such vectors can be used for nucleic acid preparations, including antisense
nucleic acids, and for the expression of encoded polypeptides or peptides.
Host cells can be prokaryotic or eukaryotic cells. In specific embodiments,
an expression vector comprises a DNA sequence encoding the 12q23-qter
polypeptide sequence (e.g., SEQ ID NO:93 to SEQ ID NO:155), sequence
variants, or fragments thereof, as described herein.

The present invention further relates to isolated 12q23-qter
polypeptides and peptides. In specific embodiments, the polypeptides or
peptides comprise the amino acid sequences encoded by the 12q23-qter
genes (e.g., SEQ ID NO:93 to SEQ ID NO:155), sequence variants, or
portions thereof, as described herein. In addition, this invention
encompasses isolated fusion proteins comprising 12q23-qter polypeptides
or peptides.

The present invention also relates to isolated antibodies, including
monoclonal and polyclonal antibodies, and antibody fragments, that are
specifically reactive with the 12q23-qter polypeptides, fusion proteins, or
variants, or portions thereof, as disclosed herein. In specific embodiments,
monoclonal antibodies are prepared to be specifically reactive with a
12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155) or peptides, or
sequence variants thereof.

In addition, the present invention relates to methods of obtaining
12q23-qter polynucleotides and polypeptides, variant sequences, or
fragments thereof, as disclosed herein. Also related are methods of
obtaining antibodies and antibody fragments that bind to 12q23-qter
polypeptides, variant sequences, or fragments thereof. The present
invention also encompasses methods of obtaining 12q23-qter ligands, e.g.,
agonists, antagonists, inhibitors, and binding factors. Such ligands can be
used as therapeutics for asthma and related diseases.

The present invention also relates to diagnostic methods and kits
utilizing 12q23-qter (wild-type, mutant, or variant) nucleic acids,
polypeptides, antibodies, or functional fragments thereof. Such factors can
be used, for example, in diagnostic methods and kits for measuring
12q23-qter gene expression levels, and to screen for various
12q23-qter-related diseases, especially asthma. In addition, the nucleic
acids described herein can be used to identify chromosomal abnormalities
affecting 12q23-qter genes, and to identify allelic variants or mutations of
12q23-qter genes in an individual or population.

The present invention further relates to methods and therapeutics for
the treatment of various diseases, including asthma. In various
embodiments, therapeutics comprising the disclosed 12q23-qter nucleic
acids, polypeptides, antibodies, ligands, or variants, derivatives, or portions
thereof, are administered to a subject to treat, prevent, or ameliorate
asthma. Specifically related are therapeutics comprising 12q23-qter
antisense nucleic acids, monoclonal antibodies, and gene therapy vectors.
Such therapeutics can be administered alone, or in combination with one or
more asthma treatments.

In addition, this invention relates to non-human transgenic animals
and cell lines comprising one or more of the disclosed 12q23-qter nucleic
acids, which can be used for drug screening, protein production, and other
purposes. Also related are non-human knock-out animals and cell lines,
wherein one or more endogenous 12q23-qter genes (i.e., orthologs), or
portions thereof, are deleted or replaced by marker genes.

This invention further relates to methods of identifying proteins that
are candidates for being involved in asthma (i.e., a "candidate protein").
Such proteins are identified by a method comprising: 1) identifying a protein
in a first individual having the asthma phenotype; 2) identifying a protein in a
second individual not having the asthma phenotype; and 3) comparing the
protein of the first individual to the protein of the second individual, wherein
a) the protein that is present in the second individual but not the first
individual is the candidate protein; or b) the protein that is present in a
higher amount in the second individual than in the first individual is the
candidate protein; or c) the protein that is present in a lower amount in the
second individual than in the first individual is the candidate protein.
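
By way of non-limiting illustration, the comparison of steps 1) to 3) can be
sketched in Python as follows. The function name, the representation of each
individual as a mapping from protein name to measured amount, and the example
values are assumptions introduced for this sketch and are not part of the
disclosure. A protein detected only in the first (affected) individual is not
addressed by criteria a) to c) as written and is therefore not reported here.

```python
from typing import Dict


def candidate_proteins(first: Dict[str, float],
                       second: Dict[str, float]) -> Dict[str, str]:
    """Return {protein: criterion} for proteins meeting criterion a, b, or c.

    first  -- amounts measured in the individual having the asthma phenotype
    second -- amounts measured in the individual not having the phenotype
    A protein missing from a mapping is treated as not detected.
    """
    candidates = {}
    for name, second_amount in second.items():
        if name not in first:
            candidates[name] = "a"   # present in second but not in first
        elif second_amount > first[name]:
            candidates[name] = "b"   # higher amount in second than in first
        elif second_amount < first[name]:
            candidates[name] = "c"   # lower amount in second than in first
        # equal amounts: not a candidate under any criterion
    return candidates


# Hypothetical measurements (arbitrary units).
first_individual = {"P1": 5.0, "P2": 1.0, "P4": 3.0}
second_individual = {"P1": 2.0, "P2": 4.0, "P3": 1.5, "P4": 3.0}
result = candidate_proteins(first_individual, second_individual)
print(result)
```

In practice a difference in amount would be judged against measurement error
and replicated across individuals; the exact comparison above is a
simplification.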

BRIEF DESCRIPTION OF THE FIGURES

Figures 1A-1D show the plot of multipoint LOD score against the
map location of the markers along chromosome 12 for four phenotypes:
asthma, bronchial hyper-responsiveness, total IgE, and specific IgE.

Figures 2A-2P show genes mapped to the 12q23-qter interval
determined from information that is curated by the National Center for
Biotechnology Information, "NCBI" (http://www.ncbi.nlm.nih.gov/genemap/).
This particular information contains genes mapped against the Gene Bridge
(GB) 4 panel.

Figures 3A-3G show genes mapped to the 12q23-qter interval
determined from information that is curated by NCBI
(http://www.ncbi.nlm.nih.gov/genemap/). This particular information
contains genes mapped against the Gene Bridge (GB) 3 panel.

Figure 4 shows the integration of the Marshfield Center for Medical
Genetics (http://www.marshmed.org/genetics/) genetic map with
GeneMap99 from NCBI. The regions of study mentioned above are
indicated at the top of the figure.

Figures 5A-5I show the BAC/STS content contig map for
chromosome 12.

Figures 6A-6U show the results of Northern blot analysis of the
genes of 12q23-qter in various tissues.

Figures 7A-7H show the cDNA sequence and amino acid sequence
of Gene 454 with the corresponding SNPs underlined.

Figure 8 shows the results of RT-PCR analysis of Gene 561.1 and
Gene 561.2.

Figures 9A-9F show the cDNA sequence and amino acid sequence
of Gene 757 with the corresponding SNPs underlined.

Figure 10 shows the domain structure of Gene 454 and the exon
location of the corresponding SNPs.

Figure 11 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (asthma) and controls in the combined
population against the relative location (Kb) of SNPs along chromosome 12.

Figure 12 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (asthma) and controls in the US and UK
populations against the relative location (Kb) of SNPs along chromosome
12.

Figure 13 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (BHR (PC20 ≤ 16 mg/ml) and asthma) and
controls in the combined population against the relative location (Kb) of
SNPs along chromosome 12.

Figure 14 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (BHR (PC20 ≤ 16 mg/ml) and asthma) and
controls in the US and UK populations against the relative location (Kb) of
SNPs along chromosome 12.

Figure 15 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (total IgE and asthma) and controls in the
combined population against the relative location (Kb) of SNPs along
chromosome 12.

Figure 16 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (total IgE and asthma) and controls in the
US and UK populations against the relative location (Kb) of SNPs along
chromosome 12.



Figure 17 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (specific IgE and asthma) and controls in
the combined population against the relative location (Kb) of SNPs along
chromosome 12.

Figure 18 shows the significance (-log10(p-value)) for the comparison
of SNP allele frequencies in cases (specific IgE and asthma) and controls in
the US and UK populations against the relative location (Kb) of SNPs along
chromosome 12.

Figure 19 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (asthma) and controls
in the combined population against the relative location (Kb) of SNPs along
chromosome 12.

Figure 20 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (asthma) and controls
in the US and UK populations against the relative location (Kb) of SNPs
along chromosome 12.

Figure 21 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (BHR (PC20 ≤ 16 mg/ml)
and asthma) and controls in the combined population against the relative
location (Kb) of SNPs along chromosome 12.

Figure 22 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (BHR (PC20 ≤ 16 mg/ml)
and asthma) and controls in the US and UK populations against the relative
location (Kb) of SNPs along chromosome 12.

Figure 23 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (total IgE and asthma)
and controls in the combined population against the relative location (Kb) of
SNPs along chromosome 12.

Figure 24 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (total IgE and asthma)
and controls in the US and UK populations against the relative location (Kb)
of SNPs along chromosome 12.

Figure 25 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (specific IgE and
asthma) and controls in the combined population against the relative
location (Kb) of SNPs along chromosome 12.

Figure 26 shows the significance (-log10(p-value)) for the comparison
of haplotype frequencies (2-SNP-at-a-time) in cases (specific IgE and
asthma) and controls in the US and UK populations against the relative
location (Kb) of SNPs along chromosome 12.

Figures 27A-27K show the cDNA sequence and amino acid
sequence of Gene 561.1 with the corresponding SNPs underlined.

Figures 28A-28C show the cDNA sequence and amino acid
sequence of Gene 561.2 with the corresponding SNPs underlined.
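
As a non-limiting illustration of the quantity plotted in Figures 11 to 18,
the following Python sketch computes -log10(p-value) for a comparison of
allele counts in cases and controls. This description of the figures does not
state which statistical test produced the plotted values; a Pearson chi-square
test on a 2x2 table of allele counts (one degree of freedom, no continuity
correction) is assumed here, and the function name and the example counts are
hypothetical.

```python
import math


def neg_log10_p(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int) -> float:
    """-log10 of the chi-square (1 d.f.) p-value for a 2x2 allele-count table.

    case_a, case_b -- counts of allele A and allele B in cases
    ctrl_a, ctrl_b -- counts of allele A and allele B in controls
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    rows = (case_a + case_b) * (ctrl_a + ctrl_b)
    cols = (case_a + ctrl_a) * (case_b + ctrl_b)
    if rows == 0 or cols == 0:
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (rows * cols)
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    if p == 0.0:
        return math.inf            # p-value underflowed to zero
    return -math.log10(p)


# Hypothetical allele counts at one SNP.
score = neg_log10_p(60, 40, 40, 60)
print(round(score, 2))
```

Plotting this score for each SNP against its position (Kb) along the
chromosome would give a display of the kind described for these figures.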

DETAILED DESCRIPTION OF THE INVENTION 

Chromosome 12q23-qter genes were isolated by narrowly defining
the region of chromosome 12q23-qter that showed association with asthma.
Chromosome 12q23-qter genes have been implicated in other diseases,
including obesity. Bronchial asthma, furthermore, has been linked to
intestinal conditions such as inflammatory bowel disease (B. Wallaert et al.,
1995, J. Exp. Med. 182:1897-1904). Thus, there was a need to identify and
isolate the gene(s) associated with this region of human chromosome 12.

To aid in the understanding of the specification and claims, the
following definitions are provided.

DEFINITIONS

"Disorder region" refers to a portion of the human chromosome 12
bounded by the markers D12S2070 to the 12q telomere. A
"disorder-associated" nucleic acid or "disorder-associated" polypeptide sequence
refers to a nucleic acid sequence that maps to region 12q23-qter and
polypeptides encoded thereby. For nucleic acid sequences, this
encompasses sequences that are homologous or complementary to the
reference sequence, as well as "sequence-conservative variants" and
"function-conservative variants." For polypeptide sequences, this
encompasses "function-conservative variants." Also encompassed are
naturally-occurring mutations associated with respiratory diseases including,
but not limited to, asthma and atopy, as well as other diseases arising from
mutations in this region including those described in detail herein. These
mutations are not limited to mutations that cause inappropriate expression
(e.g., lack of expression, over-expression, and expression in an
inappropriate tissue type).

"Sequence-conservative" variants are those in which a change of one
or more nucleotides in a given codon position results in no alteration in the
amino acid encoded at that position (i.e., silent mutations).
"Function-conservative" variants are those in which a change in one or more
nucleotides in a given codon position results in a polypeptide sequence in
which a given amino acid residue in a polypeptide has been changed
without substantially altering the overall conformation and function of the
native polypeptide, including, but not limited to, replacement of an amino
acid with one having similar physico-chemical properties (such as, for
example, acidic, basic, hydrophobic, and the like). "Function-conservative"
variants also include analogs of a given polypeptide and any polypeptides
that have the ability to elicit antibodies specific to a designated polypeptide.
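
For illustration only, the distinction between a silent codon change and an
amino acid replacement within one physico-chemical class can be sketched in
Python using the standard genetic code. The function name and the grouping of
amino acids into classes are assumptions of this sketch, not definitions of
the specification. A same-class replacement is only a candidate
function-conservative change, since conformation and function cannot be
established from sequence alone.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative (assumed) physico-chemical classes, one-letter codes.
CLASSES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "hydrophobic": set("AVLIMFW"),
    "polar": set("STNQYCGP"),
}


def classify_codon_change(ref_codon: str, var_codon: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    var_aa = CODON_TABLE[var_codon.upper()]
    if ref_aa == var_aa:
        return "sequence-conservative"       # silent mutation
    if any(ref_aa in group and var_aa in group for group in CLASSES.values()):
        return "same-class substitution"     # candidate function-conservative
    return "other substitution"


print(classify_codon_change("CTT", "CTC"))   # Leu to Leu
print(classify_codon_change("GAT", "GAA"))   # Asp to Glu
print(classify_codon_change("GAT", "AAT"))   # Asp to Asn
```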
"Nucleic acid" or "polynucleotide" as used herein refers to purine- and
pyrimidine-containing polymers of any length, either polyribonucleotides or
polydeoxyribonucleotides or mixed polyribo-polydeoxyribonucleotides. This
includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA
and RNA-RNA hybrids, as well as "protein nucleic acids" (PNA) formed by
conjugating bases to an amino acid backbone. This also includes nucleic
acids containing modified bases.

A "coding sequence" or a "protein-coding sequence" is a
polynucleotide sequence capable of being transcribed into mRNA and/or
capable of being translated into a polypeptide. The boundaries of the
coding sequence are typically determined by a translation start codon at the
5'-terminus and a translation stop codon at the 3'-terminus.
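
A minimal Python sketch of locating such boundaries in a DNA sequence is
given below for illustration. It assumes the standard start codon ATG and the
stop codons TAA, TAG, and TGA, scans only the given strand, and returns the
first start codon that is followed by an in-frame stop codon; the function
name is hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str) -> Optional[str]:
    """Return the first ATG...stop span read in frame, or None if absent."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon from the codon after the start codon.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]      # include the stop codon
        start = dna.find("ATG", start + 1)   # try the next start codon
    return None


cds = find_coding_sequence("ccATGGCTTAAgg")
print(cds)
```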

As used herein, the "reference sequence" refers to the sequence
used to compare individuals in identifying single nucleotide polymorphisms
and the like. "Variant" sequences refer to nucleotide sequences (and in
some cases, the encoded amino acid sequences) that differ from the
reference sequence(s) at one or more positions. Non-limiting examples of
variant sequences include the disclosed single nucleotide polymorphisms
(SNPs), alternate splice variants, and the amino acid sequences encoded by
these variants.

"Expressed Sequence Tag (EST)" is a nucleic acid that encodes for a
portion of or a full-length protein sequence.

"12q23-qter genes" and "12q23-qter nucleic acids" include the genes
and ESTs shown in Figures 2A-2P and Figures 3A-3G, as well as the
sequences listed in Table 4 (i.e., Gene 214, Gene 215, Gene 224, Gene
266, Gene 283, Gene 292, Gene 298, Gene 321, Gene 399, Gene 422,
Gene 436, Gene 454, Gene 515, Gene 536, Gene 543, Gene 548, Gene
549, Gene 550, Gene 551, Gene 553, Gene 555, Gene 558, Gene 559,
Gene 561, Gene 562, Gene 563, Gene 564, Gene 566, Gene 567, Gene
570, Gene 571, Gene 572, Gene 575, Gene 577, Gene 579, Gene 580,
Gene 581, Gene 583, Gene 584, Gene 586, Gene 587, Gene 589, Gene
590, Gene 592, Gene 593, Gene 594, Gene 595, Gene 596, Gene 601,
Gene 603, Gene 604, Gene 605, Gene 606, Gene 608, Gene 611, Gene
615, Gene 617, Gene 618, Gene 620, Gene 621, Gene 622, Gene 690,
Gene 692, Gene 693, Gene 694, Gene 695, Gene 697, Gene 698, Gene
699, Gene 702, Gene 705, Gene 707, Gene 722, Gene 748, Gene 749,
Gene 751, Gene 752, Gene 753, Gene 754, Gene 756, Gene 757, Gene
835, and Gene 848).

"12q23-qter proteins" and "12q23-qter polypeptides" include the
polypeptide sequences encoded by the genes listed in Table 4.

A "complement" of a nucleic acid sequence as used herein refers to
the "antisense" sequence that participates in Watson-Crick base-pairing with
the original sequence.
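
As a brief illustration, the complement of a DNA sequence, written 5' to 3'
(the reverse complement), can be computed in Python as follows. The function
name is an assumption of this sketch, and only the unambiguous bases A, C, G,
and T are handled.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
_PAIRING = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of seq, written 5' to 3'."""
    return seq.translate(_PAIRING)[::-1]


antisense = reverse_complement("ATGC")
print(antisense)
```

Applying the function twice returns the original sequence, which is a
convenient sanity check.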

A "probe" refers to a nucleic acid or oligonucleotide that forms a 
hybrid structure with a sequence in a target region due to complementarity 
of at least one sequence in the probe with a sequence in the target region. 
Nucleic acids are "hybridizable" to each other when at least one
strand of nucleic acid can anneal to another nucleic acid strand under
defined stringency conditions. As is well known in the art, stringency of
hybridization is determined, e.g., by (a) the temperature at which
hybridization and/or washing is performed, and (b) the ionic strength and
polarity (e.g., formamide) of the hybridization and washing solutions, as well
as other parameters. Hybridization requires that the two nucleic acids
contain substantially complementary sequences; depending on the
stringency of hybridization, however, mismatches may be tolerated. The
appropriate stringency for hybridizing nucleic acids depends on the length of
the nucleic acids and the degree of complementarity, variables well known in
the art.

"Gene" refers to a DNA sequence that encodes through its template
or messenger RNA a sequence of amino acids characteristic of a specific
peptide, polypeptide, or protein. The term "gene" as used herein with
reference to genomic DNA includes intervening, non-coding regions, as well
as regulatory regions, and can include 5' and 3' ends.

"Gene sequence" refers to a DNA molecule, including a DNA
molecule that contains a non-transcribed or non-translated sequence. The
term is also intended to include any combination of gene(s), gene
fragment(s), non-transcribed sequence(s), or non-translated sequence(s)
that are present on the same DNA molecule.

A gene sequence is "wild-type" if such sequence is usually found in
individuals unaffected by the disease or condition of interest. However,
environmental factors and other genes can also play an important role in the
ultimate determination of the disease. In the context of complex diseases
involving multiple genes ("oligogenic disease"), the "wild type", or normal
sequence can also be associated with a measurable risk or susceptibility,
receiving its reference status based on its frequency in the general
population. As used herein, "wild-type" refers to the reference sequence.
The wild-type sequences are used to identify the variants (single nucleotide
polymorphisms) described in detail herein.

A gene sequence is a "mutant" sequence if it differs from the
wild-type sequence. For example, a Gene 454 nucleic acid containing a single
nucleotide polymorphism is a mutant sequence. In some cases, the
individual carrying such gene has increased susceptibility toward the
disease or condition of interest. In other cases, the "mutant" sequence
might also refer to a sequence that decreases the susceptibility toward a
disease or condition of interest, thus acting in a protective manner. Also,
a gene is a "mutant" gene if too much ("overexpressed") or too little
("underexpressed") of such gene is expressed in the tissues in which such
gene is normally expressed, thereby causing the disease or condition of
interest.

"cDNA" refers to complementary or copy DNA produced from an RNA
template by the action of RNA-dependent DNA polymerase (reverse
transcriptase). Thus, a "cDNA clone" means a duplex DNA sequence
complementary to an RNA molecule of interest, carried in a cloning vector or
PCR amplified. This term includes genes from which the intervening
sequences have been removed.

"Recombinant DNA" means a molecule that has been recombined by
in vitro splicing, and includes cDNA or a genomic DNA sequence.

"Cloning" refers to the use of in vitro recombination techniques to
insert a particular gene or other DNA sequence into a vector molecule. In
order to successfully clone a desired gene, it is necessary to use methods
for generating DNA fragments, for joining the fragments to vector molecules,
for introducing the composite DNA molecule into a host cell in which it can
replicate, and for selecting the clone having the target gene from amongst
the recipient host cells.

"cDNA library" refers to a collection of recombinant DNA molecules
containing cDNA inserts, which together comprise the entire genome of an
organism. Such a cDNA library can be prepared by methods known to one
skilled in the art and described by, for example, Cowell and Austin, 1997,
"cDNA Library Protocols," Methods in Molecular Biology. Generally, RNA is
first isolated from the cells of an organism from whose genome it is desired
to clone a particular gene.

The term "vector" as used herein refers to a nucleic acid molecule
capable of replicating another nucleic acid to which it has been linked. A
vector, for example, can be a plasmid.

"Cloning vector" refers to a plasmid or phage DNA or other DNA
sequence that is able to replicate in a host cell. The cloning vector is
characterized by one or more endonuclease recognition sites at which such
DNA sequences may be cut in a determinable fashion without loss of an
essential biological function of the DNA, which may contain a marker
suitable for use in the identification of transformed cells.

"Expression vector" refers to a vehicle or vector similar to a cloning
vector but which is capable of expressing a nucleic acid sequence that has
been cloned into it, after transformation into a host. A nucleic acid sequence
is "expressed" when it is transcribed to yield an mRNA sequence. In most
cases, this transcript will be translated to yield amino acid sequence. The
cloned gene is usually placed under the control of (i.e., operably linked to)
an expression control sequence.

"Expression control sequence" or "regulatory sequence" refers to a
nucleotide sequence that controls or regulates expression of structural
genes when operably linked to those genes. These include, for example,
the lac systems, the trp system, major operator and promoter regions of the
phage lambda, the control region of fd coat protein and other sequences
known to control the expression of genes in prokaryotic or eukaryotic cells.
Expression control sequences will vary depending on whether the vector is
designed to express the operably linked gene in a prokaryotic or eukaryotic
host, and may contain transcriptional elements such as enhancer elements,
termination sequences, tissue-specificity elements and/or translational
initiation and termination sites.
"Operably linked" means that the promoter controls the initiation of
expression of the gene. A promoter is operably linked to a sequence of
proximal DNA if upon introduction into a host cell the promoter determines
the transcription of the proximal DNA sequence(s) into one or more species
of RNA. A promoter is operably linked to a DNA sequence if the promoter is
capable of initiating transcription of that DNA sequence.

"Host" includes prokaryotes and eukaryotes. The term includes an
organism or cell that is the recipient of a replicable expression vector.

The introduction of the nucleic acids into the host cell by any method
known in the art, including those described herein, will be referred to herein
as "transformation." The cells into which have been introduced nucleic acids
described above are meant to also include the progeny of such cells.

"Amplification of nucleic acids" refers to methods such as polymerase
chain reaction (PCR), ligation amplification (or ligase chain reaction, LCR)
and amplification methods based on the use of Q-beta replicase. These
methods are well known in the art and described, for example, in U.S.
Patent Nos. 4,683,195 and 4,683,202. Reagents and hardware for
conducting PCR are commercially available. Primers useful for amplifying
sequences from the disorder region are preferably complementary to, and
preferably hybridize specifically to, sequences in the 12q23-qter region or in
regions that flank a target region therein. Chromosome 12q23-qter genes
generated by amplification may be sequenced directly. Alternatively, the
amplified sequence(s) may be cloned prior to sequence analysis.

A nucleic acid or fragment thereof is "substantially homologous" or
"substantially similar" to another if, when optimally aligned (with appropriate
nucleotide insertions and/or deletions) with the other nucleic acid (or its
complementary strand), there is nucleotide sequence identity in at least 60%
of the nucleotide bases, usually at least 70%, more usually at least 80%,
preferably at least 90%, and more preferably at least 95-98% of the
nucleotide bases.
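
For illustration, the identity thresholds above can be applied to two
sequences that have already been optimally aligned, as in the following
Python sketch. The alignment step itself is not shown. The function names,
the use of "-" as the gap character, and the choice of the full aligned
length as the denominator are assumptions of this sketch; other conventions
for counting gapped positions exist.

```python
from typing import Optional

# Identity thresholds (percent) recited in the definition, highest first.
THRESHOLDS = (95, 90, 80, 70, 60)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions holding identical, non-gap bases."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a.upper(),
                                                      aligned_b.upper()))
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the highest recited threshold met, or None if below 60%."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None


identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
print(identity, highest_threshold_met(identity))
```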

Alternatively, substantial homology or similarity exists when a nucleic
acid or fragment thereof will hybridize, under selective hybridization
conditions, to another nucleic acid (or a complementary strand thereof).
Selectivity of hybridization exists when hybridization which is substantially
more selective than total lack of specificity occurs. Typically, selective
hybridization will occur when there is at least 55% homology over a stretch
of at least nine or more nucleotides, preferably at least 65%, more
preferably at least 75%, and most preferably at least 90% (see, M.
Kanehisa, 1984, Nucl. Acids Res. 11:203-213). The length of homology
comparison, as described, may be over longer stretches, and in certain
embodiments will often be over a stretch of at least 14 nucleotides, usually
at least 20 nucleotides, more usually at least 24 nucleotides, typically at
least 28 nucleotides, more typically at least 32 nucleotides, and preferably at
least 36 or more nucleotides.
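
The stretch-based criterion can be illustrated by a simple window scan, shown
below in Python. This is a sequence-level simplification and not a model of
hybridization itself: it compares the two sequences position by position at
the same offset, without gaps, and treats homology as the fraction of
identical bases in a window. The function name and default parameters (a
nine-nucleotide window and a 55% threshold, taken from the passage above) are
assumptions of this sketch.

```python
def has_selective_stretch(a: str, b: str,
                          min_len: int = 9,
                          min_identity: float = 0.55) -> bool:
    """True if some ungapped window of min_len bases meets the identity threshold."""
    a, b = a.upper(), b.upper()
    n = min(len(a), len(b))
    for start in range(n - min_len + 1):
        window_a = a[start:start + min_len]
        window_b = b[start:start + min_len]
        matches = sum(x == y for x, y in zip(window_a, window_b))
        if matches / min_len >= min_identity:
            return True
    return False


print(has_selective_stretch("ACGTACGTA", "ACGTAGGGG"))   # 6 of 9 bases match
print(has_selective_stretch("AAAAAAAAA", "CCCCCCCCC"))   # no bases match
```

To test against the complementary strand, the second sequence would first be
replaced by its reverse complement.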

Nucleic acids referred to herein as "isolated" are nucleic acids 
separated away from the nucleic acids of the genomic DNA or cellular RNA 
of their source of origin (e.g., as it exists in cells or in a mixture of nucleic 
acids such as a library), and may have undergone further processing. 
"Isolated", as used herein, refers to nucleic or amino acid sequences that 
are at least 60% free, preferably 75% free, and most preferably 90% free 
from other components with which they are naturally associated. "Isolated"
nucleic acids (polynucleotides) include nucleic acids obtained by methods 
described herein, similar methods or other suitable methods, including 
essentially pure nucleic acids, nucleic acids produced by chemical 
synthesis, by combinations of biological and chemical methods, and 
recombinant nucleic acids which are isolated. Nucleic acids referred to 
herein as "recombinant" are nucleic acids which have been produced by 
recombinant DNA methodology, including those nucleic acids that are 
generated by procedures which rely upon a method of artificial replication, 
such as the polymerase chain reaction (PCR) and/or cloning into a vector 
using restriction enzymes. "Recombinant" nucleic acids are also those that 
result from recombination events that occur through the natural mechanisms 
of cells, but are selected for after the introduction to the cells of nucleic acids 
designed to allow or make probable a desired recombination event. 
Portions of the isolated nucleic acids which code for polypeptides having a
certain function can be identified and isolated by, for example, the method of 
Jasin, M., et al., U.S. Patent No. 4,952,501. 

In the context of this invention, the term "oligonucleotide" refers to 
naturally-occurring species or synthetic species formed from naturally-
occurring subunits or their close homologs. The term may also refer to
moieties that function similarly to oligonucleotides, but have non-naturally-
occurring portions. Thus, oligonucleotides may have altered sugar moieties
or inter-sugar linkages. Exemplary among these are phosphorothioate and
other sulfur containing species which are known in the art.

As used herein, the terms "protein" and "polypeptide" are 
synonymous. "Peptides" are defined as fragments or portions of 
polypeptides, preferably fragments or portions having at least one functional 
activity (e.g., proteolysis, adhesion, fusion, antigenic, or intracellular activity) 
as the complete polypeptide sequence.

As used herein, "isolated" proteins or polypeptides are proteins or 
polypeptides purified to a state beyond that in which they exist in cells. In a 
preferred embodiment, they are at least 10% pure; i.e., most preferably they 
are substantially purified to 80 or 90% purity. "Isolated" proteins or 
polypeptides include proteins or polypeptides obtained by methods
described infra, similar methods or other suitable methods, and include 
essentially pure proteins or polypeptides, proteins or polypeptides produced 
by chemical synthesis or by combinations of biological and chemical 
methods, and recombinant proteins or polypeptides which are isolated. 
Proteins or polypeptides referred to herein as "recombinant" are proteins or
polypeptides produced by the expression of recombinant nucleic acids. 

A "portion" as used herein with regard to a protein or polypeptide, 
refers to fragments of that protein or polypeptide. The fragments can range 
in size from 5 amino acid residues to all but one residue of the entire protein 
sequence. Thus, a portion or fragment can be at least 5, 5-50, 50-100, 100-
200, 200-400, 400-800, or more consecutive amino acid residues of a
chromosome 12q23-qter protein or polypeptide, for example, SEQ ID NO:93
to SEQ ID NO:155, or variants thereof. 

The term "immunogenic" refers to the ability of a molecule (e.g., a
polypeptide or peptide) to elicit a humoral and/or cellular immune response
in a host animal.

The term "antigenic" refers to the ability of a molecule (e.g., a 
polypeptide or peptide) to bind to its specific antibody with sufficiently high 
affinity to form a detectable antigen-antibody complex. 

"Antibodies" refer to polyclonal and/or monoclonal antibodies and 
fragments thereof, and immunologic binding equivalents thereof, that can
bind to asthma proteins and fragments thereof or to nucleic acid sequences 
from the 12q23-qter region, particularly from the asthma locus or a portion 
thereof. The term antibody is used both to refer to a homogeneous 
molecular entity, or a mixture such as a serum product made up of a 
plurality of different molecular entities.

The term "monoclonal antibody" or "monoclonal antibody 
composition", as used herein, refers to a population of antibody molecules 
that contain only one species of an antigen binding site capable of 
immunoreacting with a particular epitope of a 12q23-qter polypeptide or 
peptide. A monoclonal antibody composition thus typically displays a single
binding affinity for a particular 12q23-qter polypeptide or peptide with which 
it immunoreacts. 

The term "ligand" as used herein describes any molecule, protein, 
peptide, or compound with the capability of directly or indirectly altering the 
physiological function, stability, or levels of a polypeptide.

A "sample" as used herein refers to a biological sample, such 
as, for example, tissue or fluid isolated from an individual (including, without 
limitation, plasma, serum, cerebrospinal fluid, lymph, tears, saliva, milk, pus,
and tissue exudates and secretions) or from in vitro cell culture constituents,
as well as samples obtained from, for example, a laboratory procedure.

As used herein, the term "ortholog" denotes a gene or polypeptide
obtained from one species that has homology to an analogous gene or
polypeptide from a different species. This is in contrast to "paralog", which
denotes a gene or polypeptide obtained from a given species that has
homology to a distinct gene or polypeptide from that same species.

Standard reference works setting forth the general principles of 
recombinant DNA technology include J. Sambrook et al., 1989, Molecular 
Cloning: A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory 
Press, Cold Spring Harbor, NY; P.B. Kaufman et al. (eds), 1995, Handbook
of Molecular and Cellular Methods in Biology and Medicine, CRC Press,
Boca Raton; M.J. McPherson (ed), 1991, Directed Mutagenesis: A Practical
Approach, IRL Press, Oxford; J. Jones, 1992, Amino Acid and Peptide
Synthesis, Oxford Science Publications, Oxford; B.M. Austen and O.M.R.
Westwood, 1991, Protein Targeting and Secretion, IRL Press, Oxford; D.N.
Glover (ed), 1985, DNA Cloning, Volumes I and II; M.J. Gait (ed), 1984,
Oligonucleotide Synthesis; B.D. Hames and S.J. Higgins (eds), 1984,
Nucleic Acid Hybridization; Wu and Grossman (eds), Methods in
Enzymology (Academic Press, Inc.), Vol. 154 and Vol. 155; Quirke and
Taylor (eds), 1991, PCR-A Practical Approach; Hames and Higgins (eds),
1984, Transcription and Translation; R.I. Freshney (ed), 1986, Animal Cell
Culture; Immobilized Cells and Enzymes, 1986, IRL Press; Perbal, 1984, A
Practical Guide to Molecular Cloning; J. H. Miller and M. P. Calos (eds),
1987, Gene Transfer Vectors for Mammalian Cells, Cold Spring Harbor
Laboratory Press; M.J. Bishop (ed), 1998, Guide to Human Genome
Computing, 2d Ed., Academic Press, San Diego, CA; L.F. Peruski and A.H.
Peruski, 1997, The Internet and the New Biology: Tools for Genomic and 
Molecular Research, American Society for Microbiology, Washington, D.C. 

Standard reference works setting forth the general principles of 
immunology include S. Sell, 1996, Immunology, Immunopathology & 
Immunity, 5th Ed., Appleton & Lange, Publ., Stamford, CT; D. Male et al., 
1996, Advanced Immunology, 3d Ed., Times Mirror Int'l Publishers Ltd., 
Publ., London; D.P. Stites and A.I. Terr, 1991, Basic and Clinical
Immunology, 7th Ed., Appleton & Lange, Publ., Norwalk, CT; and A.K.
Abbas et al., 1991, Cellular and Molecular Immunology, W. B. Saunders
Co., Publ., Philadelphia, PA. Any suitable materials and/or methods known
to those of skill can be utilized in carrying out the present invention;
however, preferred materials and/or methods are described. Materials, 
reagents, and the like to which reference is made in the following description 
and examples are generally obtainable from commercial sources, and 
specific vendors are cited herein. 

NUCLEIC ACIDS

The present invention relates to nucleic acids from chromosome 
12q23-qter genes (Table 4; e.g., SEQ ID NO:1 to SEQ ID NO:92), genomic
DNA within BAC end sequences (e.g., SEQ ID NO:156 to SEQ ID NO:693),
and genomic DNA of BAC sequences (e.g., SEQ ID NO:694 to SEQ ID
NO:979), direct selected sequences (e.g., SEQ ID NO:980 to SEQ ID
NO:1766), clusters (e.g., SEQ ID NO:1767 to SEQ ID NO:4687), RNA,
fragments of the genomic, cDNA, or RNA nucleic acids comprising 20, 40,
60, 100, 200, 500 or more contiguous nucleotides, and the complements
thereof. Closely related variants are also included as part of this invention,
as well as recombinant nucleic acids comprising at least 50, 60, 70, 80, or
90% of the nucleic acids described above which would be identical to 
nucleic acids from chromosome 12q23-qter genes except for one or a few 
substitutions, deletions, or additions. 

Further, the nucleic acids of this invention include the adjacent 
chromosomal regions of chromosome 12q23-qter genes required for
accurate expression of the respective gene. In a preferred embodiment, the
present invention is directed to at least 15 contiguous nucleotides of the
nucleic acid sequence of any of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID
NO:156 to SEQ ID NO:4687. More particularly, embodiments of this
invention include the BAC clones containing segments of chromosome
12q23-qter genes including RPCI-11_0899A17, RPCI-11_0666B20,
RPCI-11_0723P10, RPCI-11_0831E18, RPCI-11_0932D22, and
RPCI-11_0702C13. A preferred embodiment is the nucleotide sequence of the
BAC clones consisting of SEQ ID NO:694 to SEQ ID NO:979 and those 
listed in Table 3. Another embodiment is the nucleotide sequence of the 
BAC end sequences of SEQ ID NO:156 to SEQ ID NO:693. 

The invention also relates to direct selected clones and ESTs from 
the 12q23-qter (e.g., SEQ ID NO:1 to SEQ ID NO:92). In a preferred 
embodiment, the invention relates to clusters of nucleic acids combining the 
direct selected clones with ESTs homologous to the BAC sequences and 
BAC end sequences (SEQ ID NO:1675 to SEQ ID NO:4594). 

The invention also concerns the use of the nucleotide sequence of 
the nucleic acids of this invention to identify DNA probes for genes of 12q23- 
qter (SEQ ID NO:1 to SEQ ID NO:92), BAC end sequences (SEQ ID 
NO:156 to SEQ ID NO:693), BACs (SEQ ID NO:694 to SEQ ID NO:979),
direct selected clones (SEQ ID NO:980 to SEQ ID NO:1766), and sequence 
clusters (SEQ ID NO:1767 to SEQ ID NO:4687), PCR primers to amplify the 
genes of 12q23-qter, nucleotide polymorphisms (Table 10), and regulatory 
elements of the genes of 12q23-qter. 

This invention further relates to methods of using isolated and/or 
recombinant 12q23-qter nucleic acids (DNA or RNA) that are characterized 
by their ability to hybridize to (a) a nucleic acid encoding a protein or 
polypeptide, such as a nucleic acid having any of the sequences SEQ ID 
NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, or (b) a 
fragment of the foregoing (e.g., any of the nucleotide sequences set forth in
Tables 8, 9, 11A and 11B). For example, a fragment can comprise the 
minimum nucleotides of a chromosome 12q23-qter protein required to 
encode a functional chromosome 12q23-qter protein, or the minimum 
nucleotides to encode a polypeptide having the amino acid sequence of 
SEQ ID NO:93 to SEQ ID NO:155, or to encode a functional equivalent 
thereof. A functional equivalent can include a polypeptide, which, when 
incorporated into a cell, has all or part of the activity of a chromosome 
12q23-qter protein. A functional equivalent of a chromosome 12q23-qter
protein, therefore, would have a similar amino acid sequence (at least 65% 
sequence identity) and similar characteristics to, or perform in substantially 
the same way as a chromosome 12q23-qter protein. A nucleic acid which 
hybridizes to a nucleic acid encoding a chromosome 12q23-qter protein or
polypeptide, such as SEQ ID NO:93 to SEQ ID NO:155, can be double- or 
single-stranded. Hybridization to DNA, such as DNA having a sequence set 
forth in SEQ ID NO:1 to SEQ ID NO:92, SEQ ID NO:156 to SEQ ID
NO:4687, Tables 8, 9, 11A, and 11B, includes hybridization to the strand
shown, or to the complementary strand.

The sequences of the present invention may be derived from a 
variety of sources including DNA, cDNA, synthetic DNA, synthetic RNA, or
combinations thereof. Such sequences may comprise genomic DNA, which 
may or may not include naturally occurring introns. Moreover, such genomic 
DNA may be obtained in association with promoter regions or poly (A)
sequences. The sequences, genomic DNA, or cDNA may be obtained in 
any of several ways. Genomic DNA can be extracted and purified from 
suitable cells by means well known in the art. Alternatively, mRNA can be 
isolated from a cell and used to produce cDNA by reverse transcription or
other means.

The present invention also relates to nucleic acids that encode a 
polypeptide having the amino acid sequence of any one of SEQ ID NO:93 to 
SEQ ID NO:155, or functional equivalents thereof. A functional equivalent of
a 12q23-qter protein includes fragments or variants that perform at least one
characteristic function of the 12q23-qter protein (e.g., antigenic or
intracellular activity). Preferably, a functional equivalent will share at least 
65% sequence identity with the 12q23-qter polypeptide. 

Sequence identity calculations can be performed using computer 
programs, hybridization methods, or calculations. Preferred computer 
program methods to determine identity and similarity between two 
sequences include, but are not limited to, the GCG program package,
BLASTN, BLASTX, TBLASTX, and FASTA (J. Devereux et al., 1984,
Nucleic Acids Research 12(1):387; S.F. Altschul et al., 1990, J. Molec. Biol.
215:403-410; W. Gish and D.J. States, 1994, Nature Genet. 3:266-272;
W.R. Pearson and D.J. Lipman, 1988, Proc. Natl. Acad. Sci. USA
85(8):2444-8). The BLAST programs are publicly available from NCBI and
other sources. The well-known Smith-Waterman algorithm may also be
used to determine identity. 

For example, nucleotide sequence identity can be determined by 
comparing a query sequence to sequences in publicly available sequence
databases (NCBI) using the BLASTN2 algorithm (S.F. Altschul et al., 1997,
Nucl. Acids Res., 25:3389-3402). The parameters for a typical search are:
E = 0.05, V = 50, B = 50, wherein E is the expected probability score cutoff,
V is the number of database entries returned in the reporting of the results,
and B is the number of sequence alignments returned in the reporting of the
results (S.F. Altschul et al., 1990, J. Mol. Biol., 215:403-410).

In another approach, nucleotide sequence identity can be calculated 
using the following equation: % identity = (number of identical nucleotides) / 
(alignment length in nucleotides) * 100. For this calculation, alignment 
length includes internal gaps but does not include terminal gaps. Alternatively,
nucleotide sequence identity can be determined experimentally using the
specific hybridization conditions described below. 
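
By way of illustration only, the percent identity equation above can be sketched in Python as follows. The function name, the use of "-" as the gap character, and the requirement that the two sequences be supplied as pre-aligned strings of equal length are assumptions made for this sketch; the alignment itself would be produced by a separate program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """% identity = identical nucleotides / alignment length * 100, where the
    alignment length counts internal gaps but excludes terminal gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    def span(seq: str) -> tuple:
        # Start (inclusive) and end (exclusive) of the non-gap region.
        start = len(seq) - len(seq.lstrip(gap))
        end = len(seq.rstrip(gap))
        return start, end

    a_start, a_end = span(aligned_a)
    b_start, b_end = span(aligned_b)
    # Columns lying outside either sequence are terminal gaps and are dropped.
    start, end = max(a_start, b_start), min(a_end, b_end)
    if end <= start:
        raise ValueError("no overlapping aligned region")

    columns = list(zip(aligned_a[start:end].upper(), aligned_b[start:end].upper()))
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)


# Two terminal gap columns are excluded; one internal gap column is counted.
example = percent_identity("--ACGT-ACGT", "TTACGTTACG-")
```

In this example the alignment length is 8 columns, 7 of which are identical, giving 87.5% identity.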

In accordance with the present invention, polynucleotide alterations 
are selected from the group consisting of at least one nucleotide deletion, 
substitution, including transition and transversion, insertion, or modification 
(e.g., via RNA or DNA analogs). Alterations may occur at the 5' or 3'
terminal positions of the reference nucleotide sequence or anywhere 
between those terminal positions, interspersed either individually among the 
nucleotides in the reference sequence or in one or more contiguous groups 
within the reference sequence. Alterations of a polynucleotide sequence of 
any one of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID
NO:4687 may create nonsense, missense, or frameshift mutations in this
coding sequence, and thereby alter the polypeptide encoded by the
polynucleotide following such alterations.

Such altered nucleic acids, including DNA or RNA, can be detected
and isolated by hybridization under high stringency conditions or moderate
stringency conditions, for example, which are chosen to prevent
hybridization of nucleic acids having non-complementary sequences.
"Stringency conditions" for hybridizations is a term of art which refers to the
conditions of temperature and buffer concentration which permit
hybridization of a particular nucleic acid to another nucleic acid in which the
first nucleic acid may be perfectly complementary to the second, or the first
and second may share some degree of complementarity which is less than
perfect.

For example, certain high stringency conditions can be used which 
distinguish perfectly complementary nucleic acids from those of less 
complementarity. "High stringency conditions" and "moderate stringency
conditions" for nucleic acid hybridizations are explained in F.M. Ausubel et 
al. (eds), 1995, Current Protocols in Molecular Biology, John Wiley and 
Sons, Inc., New York, NY, the teachings of which are hereby incorporated 
by reference. In particular, see pages 2.10.1-2.10.16 (especially pages 
2.10.8-2.10.11) and pages 6.3.1-6.3.6. The exact conditions which 
determine the stringency of hybridization depend not only on ionic strength, 
temperature and the concentration of destabilizing agents such as 
formamide, but also on factors such as the length of the nucleic acid 
sequence, base composition, percent mismatch between hybridizing 
sequences and the frequency of occurrence of subsets of that sequence
within other non-identical sequences. Thus, high or moderate stringency 
conditions can be determined empirically. 

By varying hybridization conditions from a level of stringency at which 
no hybridization occurs to a level at which hybridization is first observed, 
conditions which will allow a given sequence to hybridize with the most
similar sequences in the sample can be determined. Preferably the
hybridizing sequences will have 60-70% sequence identity, more preferably
70-85% sequence identity, and even more preferably 90-100% sequence 
identity. 

Typically, the hybridization reaction is initially performed under 
conditions of low stringency, followed by washes of varying, but higher
stringency. Reference to hybridization stringency, e.g., high, moderate, or
low stringency, typically relates to such washing conditions. Hybridization
conditions are based on the melting temperature (Tm) of the nucleic acid
probe or primer and are typically classified by degree of stringency of the
conditions under which hybridization is measured (Ausubel et al., 1995).
For example, high stringency hybridization typically occurs at about 5-10°C
below the Tm; moderate stringency hybridization occurs at about 10-20°C
below the Tm; and low stringency hybridization occurs at about 20-25°C
below the Tm. The melting temperature can be approximated by the
formulas as known in the art, depending on a number of parameters, such
as the length of the hybrid or probe in number of nucleotides, or
hybridization buffer ingredients and conditions. As a general guide, Tm
decreases approximately 1°C with every 1% decrease in sequence identity
at any given SSC concentration. Generally, doubling the concentration of
SSC results in an increase in Tm of ~17°C. Using these guidelines, the
washing temperature can be determined empirically for moderate or low 
stringency, depending on the level of mismatch sought. 
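
By way of illustration only, the two rules of thumb above (about 1°C lower melting temperature per 1% decrease in sequence identity, and about 17°C higher melting temperature per doubling of SSC concentration) can be sketched as a rough calculation. The function names, the reading of the stringency offsets as degrees Celsius below the melting temperature, and the extension of the doubling rule to arbitrary concentration ratios by a base-2 logarithm are assumptions of this sketch; actual washing temperatures are determined empirically.

```python
import math

# Offsets below the melting temperature, in degrees Celsius (min, max).
STRINGENCY_OFFSETS_C = {
    "high": (5.0, 10.0),
    "moderate": (10.0, 20.0),
    "low": (20.0, 25.0),
}


def adjusted_tm(tm_perfect_c: float, percent_identity: float,
                ssc_x: float, ssc_ref_x: float) -> float:
    """Estimate the melting temperature of a mismatched hybrid from the
    melting temperature of the perfect hybrid measured at ssc_ref_x."""
    mismatch_drop = 100.0 - percent_identity            # ~1 degree C per 1% mismatch
    salt_shift = 17.0 * math.log2(ssc_x / ssc_ref_x)    # ~17 degrees C per doubling
    return tm_perfect_c - mismatch_drop + salt_shift


def wash_window(tm_c: float, level: str) -> tuple:
    """Return the (lowest, highest) wash temperature for a stringency level."""
    near, far = STRINGENCY_OFFSETS_C[level]
    return (tm_c - far, tm_c - near)


# Hypothetical probe: perfect-match melting temperature of 80 C at 1 X SSC,
# target sequences of 90% identity, washing in 2 X SSC.
tm = adjusted_tm(80.0, 90.0, ssc_x=2.0, ssc_ref_x=1.0)
window = wash_window(tm, "moderate")
```

The numerical inputs are hypothetical and serve only to show how the guidelines combine.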

High stringency hybridization conditions are typically carried out at 65 
to 68°C in 0.1 X SSC and 0.1% SDS. Highly stringent conditions allow 
hybridization of nucleic acid molecules having about 95 to 100% sequence
identity. Moderate stringency hybridization conditions are typically carried 
out at 50 to 65°C in 1 X SSC and 0.1% SDS. Moderate stringency 
conditions allow hybridization of sequences having at least 80 to 95% 
nucleotide sequence identity. Low stringency hybridization conditions are 
typically carried out at 40 to 50°C in 6 X SSC and 0.1% SDS. Low
stringency hybridization conditions allow detection of specific hybridization of
nucleic acid molecules having at least 50 to 80% nucleotide sequence
identity. 
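
By way of illustration only, the typical conditions recited above can be collected into a small lookup that returns the most stringent level expected to retain hybrids of a given sequence identity. The class and function names are hypothetical, and the handling of boundary values (for example, treating exactly 95% identity as high stringency) is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Stringency:
    name: str
    temp_c: Tuple[int, int]            # typical temperature range, degrees C
    ssc_x: float                       # SSC concentration (X)
    sds_percent: float                 # SDS concentration (%)
    identity_percent: Tuple[int, int]  # sequence identity range detected (%)


# Ordered from most to least stringent, using the values recited above.
CONDITIONS = (
    Stringency("high", (65, 68), 0.1, 0.1, (95, 100)),
    Stringency("moderate", (50, 65), 1.0, 0.1, (80, 95)),
    Stringency("low", (40, 50), 6.0, 0.1, (50, 80)),
)


def conditions_for_identity(identity_percent: float) -> Optional[Stringency]:
    """Return the most stringent conditions whose lower identity bound is met,
    or None when the identity falls below the low stringency range."""
    for cond in CONDITIONS:
        if identity_percent >= cond.identity_percent[0]:
            return cond
    return None


chosen = conditions_for_identity(85)
```

For a target identity of 85%, the lookup selects the moderate stringency conditions (50 to 65°C in 1 X SSC and 0.1% SDS).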

For example, high stringency conditions can be attained by 
hybridization in 50% formamide, 5 X Denhardt's solution, 5 X SSPE or SSC 
(1 X SSPE buffer comprises 0.15 M NaCl, 10 mM Na2HPO4, 1 mM EDTA; 1
X SSC buffer comprises 150 mM NaCl, 15 mM sodium citrate, pH 7.0), 0.2%
SDS at about 42°C, followed by washing in 1 X SSPE or SSC and 0.1%
SDS at a temperature of at least 42°C, preferably about 55°C, more
preferably about 65°C. Moderate stringency conditions can be attained, for
example, by hybridization in 50% formamide, 5 X Denhardt's solution, 5 X
SSPE or SSC, and 0.2% SDS at 42°C to about 50°C, followed by washing in
0.2 X SSPE or SSC and 0.2% SDS at a temperature of at least 42°C,
preferably about 55°C, more preferably about 65°C. Low stringency
conditions can be attained, for example, by hybridization in 10% formamide,
5 X Denhardt's solution, 6 X SSPE or SSC, and 0.2% SDS at 42°C, followed
by washing in 1 X SSPE or SSC, and 0.2% SDS at a temperature of about 
45°C, preferably about 50°C in 4 X SSC at 60°C for 30 min. 

High stringency hybridization procedures typically (1) employ low 
ionic strength and high temperature for washing, such as 0.015 M NaCl/ 
0.0015 M sodium citrate, pH 7.0 (0.1 X SSC) with 0.1% sodium dodecyl
sulfate (SDS) at 50°C; (2) employ during hybridization 50% (vol/vol)
formamide with 5 X Denhardt's solution (0.1% weight/volume highly purified
bovine serum albumin/0.1% wt/vol Ficoll/0.1% wt/vol polyvinylpyrrolidone),
50 mM sodium phosphate buffer at pH 6.5 and 5 X SSC at 42°C; or (3)
employ hybridization with 50% formamide, 5 X SSC, 50 mM sodium
phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5 X Denhardt's solution, 
sonicated salmon sperm DNA (50 ng/ml), 0.1% SDS, and 10% dextran 
sulfate at 42°C, with washes at 42°C in 0.2 X SSC and 0.1% SDS. 

In one particular embodiment, high stringency hybridization 
conditions may be attained by:

-Prehybridization treatment of the support (e.g., nitrocellulose filter or
nylon membrane), to which is bound the nucleic acid capable of hybridizing
with any of the sequences of the invention, is carried out at 65°C for 6 hr
with a solution having the following composition: 4 X SSC, 10 X Denhardt's
(1 X Denhardt's comprises 1% Ficoll, 1% polyvinylpyrrolidone, 1% BSA
(bovine serum albumin); 1 X SSC consists of 0.15 M NaCl and 0.015 M
sodium citrate, pH 7);

-Replacement of the pre-hybridization solution in contact with the 
support by a buffer solution having the following composition: 4 X SSC, 1 X 
Denhardt's, 25 mM NaPO4, pH 7, 2 mM EDTA, 0.5% SDS, 100 ug/ml of
sonicated salmon sperm DNA containing a nucleic acid derived from the 
sequences of the invention as probe, in particular a radioactive probe, and 
previously denatured by a treatment at 100°C for 3 min; 
-Incubation for 12 hr at 65°C; 
-Successive washings with the following solutions: 1) four washings
with 2 X SSC, 1 X Denhardt's, 0.5% SDS for 45 min at 65°C; 2) two
washings with 0.2 X SSC, 0.1 X SSC for 45 min at 65°C; and 3) 0.1 x SSC, 
0.1% SDS for 45 min at 65°C. 

Additional examples of high, medium, and low stringency conditions 
can be found in Sambrook et al., 1989. Exemplary conditions are also
described in M.H. Krause and S.A. Aaronson, 1991, Methods in 
Enzymology, 200:546-556; Ausubel et al., 1995. It is to be understood that 
the low, moderate and high stringency hybridization/washing conditions may 
be varied using a variety of ingredients, buffers, and temperatures well 
known to and practiced by the skilled practitioner.

Isolated and/or recombinant nucleic acids that are characterized by 
their ability to hybridize to a) a nucleic acid encoding a chromosome 12q23- 
qter polypeptide, such as the nucleic acids depicted as SEQ ID NO:1 to 
SEQ ID NO:92; b) the complement of (a); c) or a portion of (a) or (b) (e.g.,
under high or moderate stringency conditions), may further encode a protein
or polypeptide having at least one function characteristic of a chromosome
12q23-qter polypeptide, such as Gene 702, a metalloprotease-like gene
involved in inflammatory responses including tissue destruction and repair,
or binding of antibodies that also bind to non-recombinant chromosome
12q23-qter proteins or polypeptides. The catalytic or binding function of a 
protein or polypeptide encoded by the hybridizing nucleic acid may be 
detected by standard enzymatic assays for activity or binding (e.g., assays
that measure the binding of a transit peptide or a precursor, or other 
components of the translocation machinery). Enzymatic assays, 
complementation tests, or other suitable methods can also be used in 
procedures for the identification and/or isolation of nucleic acids which 
encode a polypeptide such as a polypeptide of the amino acid sequences 
SEQ ID NO:93 to SEQ ID NO:155, or a functional equivalent of these
polypeptides. The antigenic properties of proteins or polypeptides encoded 
by hybridizing nucleic acids can be determined by immunological methods 
employing antibodies that bind to a chromosome 12q23-qter polypeptide 
such as immunoblot, immunoprecipitation and radioimmunoassay. PCR
methodology, including RAGE (Rapid Amplification of Genomic DNA Ends), 
can also be used to screen for and detect the presence of nucleic acids 
which encode chromosome 12q23-qter gene-like proteins and polypeptides, 
and to assist in cloning such nucleic acids from genomic DNA. PCR 
methods for these purposes can be found in Innis, M.A., et al., 1990, PCR
Protocols: A Guide to Methods and Applications, Academic Press, Inc., San
Diego, CA., incorporated herein by reference. 

It is understood that, as a result of the degeneracy of the genetic 
code, many nucleic acid sequences are possible which encode a 
chromosome 12q23-qter gene-like protein or polypeptide. Some of these 
will have little homology to the nucleotide sequences of any known or 
naturally-occurring chromosome 12q23-qter gene-like gene but can be used 
to produce the proteins and polypeptides of this invention by selection of 
combinations of nucleotide triplets based on codon choices. Such variants,
while not hybridizable to a naturally-occurring chromosome 12q23-qter
gene, are contemplated within this invention. 
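
By way of illustration only, the effect of the degeneracy of the genetic code can be sketched by counting how many distinct nucleotide sequences encode a given peptide. The sketch assumes the standard genetic code and one-letter amino acid symbols; the function name and the example peptides are hypothetical and are not taken from the Sequence Listing.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons available for each residue ('*' = stop).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    residues = peptide.upper()
    unknown = set(residues) - set(CODONS_PER_RESIDUE)
    if unknown:
        raise ValueError(f"unrecognized residues: {sorted(unknown)}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in residues)


# Leu, Ser, and Arg each have six codons, so this tripeptide alone can be
# encoded by 6 * 6 * 6 different nucleotide sequences.
n_variants = count_coding_sequences("LSR")
```

Because the count is a product over residues, it grows rapidly with peptide length, which is why many encoding sequences can share little homology with the naturally-occurring gene.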

Also encompassed by the present invention are alternate splice 
variants produced by differential processing of the primary transcript(s) from 
12q23-qter genomic DNA. An alternate splice variant may comprise, for
example, the sequence of any one of SEQ ID NO:1 to SEQ ID NO:5; SEQ 
ID NO:17 to SEQ ID NO:18; SEQ ID NO:36 to SEQ ID NO:37; SEQ ID 
NO:43 to SEQ ID NO:44; and SEQ ID NO:80 to SEQ ID NO:81. Alternate 
splice variants can also comprise other combinations of introns/exons of 
12q23-qter genes, which can be determined by those of skill in the art.
Alternate splice variants can be determined experimentally, for example, by
isolating and analyzing cellular RNAs (e.g., Southern blotting or PCR), or by
screening cDNA libraries using the 12q23-qter nucleic acid probes or 
primers described herein. In another approach, alternate splice variants can 
be predicted using various methods, computer programs, or computer
systems available to practitioners in the field. 

General methods for splice site prediction can be found in Nakata, 
1985, Nucleic Acids Res. 13:5327-5340. In addition, splice sites can be 
predicted using, for example, the GRAIL™ (E.C. Uberbacher and R.J. Mural,
1991, Proc. Natl. Acad. Sci. USA, 88:11261-11265; E.C. Uberbacher, 1995,
Trends Biotech., 13:497-500; http://grail.lsd.ornl.gov/grailexp); GenView (L.
Milanesi et al., 1993, Proceedings of the Second International Conference
on Bioinformatics, Supercomputing, and Complex Genome Analysis, H.A.
Lim et al. (eds), World Scientific Publishing, Singapore, pp. 573-588;
http://l25.itba.mi.cnr.it/~webgene/wwwgene_help.html); SpliceView
(http://www.itba.mi.cnr.it/webgene); and HSPL (V.V. Solovyev et al., 1994,
Nucleic Acids Res. 22:5156-5163; V.V. Solovyev et al., 1994, "The
Prediction of Human Exons by Oligonucleotide Composition and
Discriminant Analysis of Spliceable Open Reading Frames," R. Altman et al.
(eds), The Second International Conference on Intelligent Systems for
Molecular Biology, AAAI Press, Menlo Park, CA, pp. 354-362; V.V. Solovyev
et al., 1993, "Identification Of Human Gene Functional Regions Based On
Oligonucleotide Composition," L. Hunter et al. (eds), In Proceedings of First
International Conference on Intelligent System for Molecular Biology,
Bethesda, pp. 371-379) computer systems.

Additionally, computer programs such as GeneParser (E.E. Snyder
and G.D. Stormo, 1995, J. Mol. Biol. 248:1-18; E.E. Snyder and G.D.
Stormo, 1993, Nucl. Acids Res. 21(3):607-613;
http://mcdb.colorado.edu/~eesnyder/GeneParser.html); MZEF (M.Q. Zhang,
1997, Proc. Natl. Acad. Sci. USA, 94:565-568;
http://argon.cshl.org/genefinder); MORGAN (S. Salzberg et al., 1998, J.
Comp. Biol. 5:667-680; S. Salzberg et al. (eds), 1998, Computational
Methods in Molecular Biology, Elsevier Science, New York, NY, pp. 187-
203); VEIL (J. Henderson et al., 1997, J. Comp. Biol. 4:127-141); GeneScan
(S. Tiwari et al., 1997, CABIOS (BioInformatics) 13:263-270); GeneBuilder
(L. Milanesi et al., 1999, Bioinformatics 15:612-621); Eukaryotic GeneMark
(J. Besemer et al., 1999, Nucl. Acids Res. 27:3911-3920); and FEXH (V.V.
Solovyev et al., 1994, Nucleic Acids Res. 22:5156-5163). In addition, splice
sites (i.e., former or potential splice sites) in cDNA sequences can be
predicted using, for example, the RNASPL (V.V. Solovyev et al., 1994,
Nucleic Acids Res. 22:5156-5163); or INTRON (A. Globek et al., 1991,
INTRON version 1.1 manual, Laboratory of Biochemical Genetics, NIMH,
Washington, D.C.) programs.

The present invention also encompasses naturally-occurring
polymorphisms of 12q23-qter genes. As will be understood by those in the
art, the genomes of all organisms undergo spontaneous mutation in the
course of their continuing evolution, generating variant forms of gene
sequences (Gusella, 1986, Ann. Rev. Biochem. 55:831-854). Restriction
fragment length polymorphisms (RFLPs) include variations in DNA
sequences that alter the length of a restriction fragment in the sequence
(Botstein et al., 1980, Am. J. Hum. Genet. 32:314-331). RFLPs have been
widely used in human and animal genetic analyses (see WO 90/13668;
WO 90/11369; Donis-Keller, 1987, Cell 51:319-337; Lander et al., 1989,
Genetics 121:85-99). Short tandem repeats (STRs) include tandem di-, tri-
and tetranucleotide repeated motifs, also termed variable number tandem
repeat (VNTR) polymorphisms. VNTRs have been used in identity and
paternity analysis (U.S. Pat. No. 5,075,217; Armour et al., 1992, FEBS Lett.
307:113-115; Horn et al., WO 91/14003; Jeffreys, EP 370,719), and in a
large number of genetic mapping studies.
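
As a non-limiting illustration of the RFLP concept described above, the following Python sketch simulates a restriction digest of a linear DNA molecule. It is a simplified teaching example and not a method of this application: the function name, the made-up allele sequences, and the choice of the GAATTC (EcoRI) recognition site with a cut one base into the site are all assumptions made for illustration. A single base change that destroys a site merges two fragments and thereby alters the fragment-length pattern.

```python
def fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`; the cut falls `cut_offset` bases into the site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical alleles: allele_b differs by one base (GAATTC -> GAGTTC),
# which removes the second recognition site.
allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(fragment_lengths(allele_a))  # [5, 14, 9]
print(fragment_lengths(allele_b))  # [5, 23]
```

The two alleles are distinguished solely by the lengths of the fragments produced, which is the property exploited in RFLP analysis.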

Single nucleotide polymorphisms (SNPs) are far more frequent than
RFLPs, STRs, and VNTRs. SNPs may occur in protein coding (e.g., exon),
or non-coding (e.g., intron, 5'UTR, 3'UTR) sequences. SNPs in protein
coding regions may comprise silent mutations that do not alter the amino
acid sequence of a protein. Alternatively, SNPs in protein coding regions
may produce conservative or non-conservative amino acid changes,
described in detail below. In some cases, SNPs may give rise to the
expression of a defective or other variant protein and, potentially, a genetic
disease. SNPs within protein-coding sequences can give rise to genetic
diseases, for example, in the β-globin (sickle cell anemia) and CFTR (cystic
fibrosis) genes. In non-coding sequences, SNPs may also result in
defective protein expression (e.g., as a result of defective splicing). Other
single nucleotide polymorphisms have no phenotypic effects.
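
As a non-limiting illustration of how a coding-region SNP may be silent or may alter the encoded amino acid, the following Python sketch translates a reference codon and its variant using the standard genetic code. The function name and the labels "silent", "missense", and "nonsense" are assumptions chosen for this example; they are not terms defined in this application. The example codon change GAG to GTG (Glu to Val) is the well-known sickle cell substitution in β-globin.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_coding_snp(codon, position, new_base):
    """Classify a single-base change at `position` (0, 1, or 2) of a codon.

    Returns (effect, reference_amino_acid, variant_amino_acid), where '*'
    denotes a stop codon. A change from a stop codon to an amino acid is
    reported as 'missense' in this simplified sketch.
    """
    codon = codon.upper()
    variant = codon[:position] + new_base.upper() + codon[position + 1:]
    ref_aa, var_aa = CODON_TABLE[codon], CODON_TABLE[variant]
    if ref_aa == var_aa:
        effect = "silent"
    elif var_aa == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return effect, ref_aa, var_aa


print(classify_coding_snp("GAG", 1, "T"))  # ('missense', 'E', 'V')
print(classify_coding_snp("GAA", 2, "G"))  # ('silent', 'E', 'E')
```

Whether a given missense change is conservative or non-conservative would then be judged from the properties of the two amino acids, as discussed in the polypeptide section.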

Single nucleotide polymorphisms can be used in the same manner as 
RFLPs and VNTRs, but offer several advantages. Single nucleotide 
polymorphisms tend to occur with greater frequency and are typically 
spaced more uniformly throughout the genome than other polymorphisms. 
Also, different SNPs are often easier to distinguish than other types of 
polymorphisms (e.g., by use of assays employing allele-specific 
hybridization probes or primers). In one embodiment of the present 
invention, a 12q23-qter nucleic acid contains at least one SNP as set forth in 
Table 10, Figures 7A-7H; Figures 9A-9F; Figures 27A-27K; and Figures 
28A-28C, described herein. Various combinations of these SNPs are also 
encompassed by the invention. In a preferred aspect, a 12q23-qter SNP is
associated with a lung-related disorder, such as asthma. Nucleic acids
comprising such SNPs can be used as diagnostic and/or therapeutic 
reagents. 
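
As a non-limiting illustration of the allele-specific probes mentioned above, the following Python sketch builds one probe sequence per allele by centering the polymorphic base between fixed flanking sequence. The function name, the flank length, and the example sequence are hypothetical; a practical probe design would also consider factors such as melting temperature and secondary structure, which are not modeled here.

```python
def allele_specific_probes(reference, snp_index, alleles, flank=8):
    """Return {allele: probe}, where each probe is the flanking sequence
    around `snp_index` (0-based) with the allele placed at the SNP position."""
    reference = reference.upper()
    left = reference[max(0, snp_index - flank):snp_index]
    right = reference[snp_index + 1:snp_index + 1 + flank]
    return {a.upper(): left + a.upper() + right for a in alleles}


# Made-up reference sequence with a hypothetical G/A SNP at index 10.
probes = allele_specific_probes("ACGTACGTACGTACGTACGT", 10, ("G", "A"), flank=4)
print(probes)  # {'G': 'GTACGTACG', 'A': 'GTACATACG'}
```

The two probes differ only at the central position, so hybridization under suitably stringent conditions can discriminate between the alleles.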

The nucleic acid sequences of the present invention may be derived
from a variety of sources including DNA, cDNA, synthetic DNA, synthetic
RNA, or combinations thereof. Such sequences may comprise genomic
DNA, which may or may not include naturally occurring introns. Moreover,
such genomic DNA may be obtained in association with promoter regions or
poly(A)+ sequences. The sequences, genomic DNA, or cDNA may be
obtained in any of several ways. Genomic DNA can be extracted and
purified from suitable cells by means well known in the art. Alternatively,
mRNA can be isolated from a cell and used to produce cDNA by reverse
transcription or other means.

The nucleic acids described herein are used in the methods of the
present invention for production of proteins or polypeptides, through
incorporation into cells, tissues, or organisms. In one embodiment, DNA
containing all or part of the coding sequence for a 12q23-qter polypeptide, or
DNA which hybridizes to DNA having the sequence of any one of SEQ ID
NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, or a
fragment thereof, is incorporated into a vector for expression of the encoded
polypeptide in suitable host cells. The encoded amino acid sequence
consisting of a 12q23-qter polypeptide, or its functional equivalent, is capable
of normal activity, such as antigenic or intracellular activity.

The invention also concerns the use of the nucleotide sequence of
the nucleic acids of this invention to identify DNA probes for 12q23-qter
genes, PCR primers to amplify 12q23-qter genes, nucleotide polymorphisms
in 12q23-qter genes, and regulatory elements of 12q23-qter genes.

The nucleic acids of the present invention find use as primers and
templates for the recombinant production of disorder-associated peptides or
polypeptides, for chromosome and gene mapping, to provide antisense
sequences, for tissue distribution studies, to locate and obtain full length
genes, to identify and obtain homologous sequences (wild-type and
mutants), and in diagnostic applications. The primers of this invention may
comprise all or a portion of the nucleotide sequence of any one of SEQ ID
NO:1 to SEQ ID NO:92, SEQ ID NO:156 to SEQ ID NO:4687, and the
sequences set forth in Tables 8, 9, 11A, and 11B, or a complementary
sequence thereof.

Probes may also be used for the detection of 12q23-qter-related
sequences, and should preferably contain at least 50%, more preferably at
least 80%, identity to a 12q23-qter polynucleotide, or a complementary
sequence, or fragments thereof. The probes of this invention may be DNA or
RNA. The probes may comprise all or a portion of the nucleotide sequence of
any one of SEQ ID NO:1 to SEQ ID NO:92, SEQ ID NO:156 to SEQ ID NO:4687,
and the sequences set forth in Tables 8, 9, 11A, and 11B, or a
complementary sequence thereof, and may include promoter, enhancer
elements, and introns of the naturally occurring 12q23-qter polynucleotide.
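
As a non-limiting illustration of the identity thresholds recited above, the following Python sketch computes percent identity between two sequences of equal length that are assumed to be already aligned without gaps. This is a deliberate simplification: the function name is hypothetical, the example sequences are made up, and the sketch does not represent any particular alignment program.

```python
def percent_identity(a, b):
    """Percent of positions at which two pre-aligned, equal-length
    sequences carry the same base (case-insensitive, no gaps)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


probe = "ACGTACGTAC"
target = "ACGTACGTTT"  # differs from the probe at the last two positions
identity = percent_identity(probe, target)
print(identity)          # 80.0
print(identity >= 50.0)  # True: meets the 50% threshold
```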

The probes and primers based on the 12q23-qter gene sequences 
disclosed herein are used to identify homologous 12q23-qter gene 
sequences and proteins in other species. These 12q23-qter gene 
sequences and proteins are used in the diagnostic/prognostic, therapeutic 
and drug-screening methods described herein for the species from which
they have been isolated. 

VECTORS AND HOST CELLS 

The nucleic acids described herein are used in the methods of the
present invention for production of proteins or polypeptides, through
incorporation into cells, tissues, or organisms. In one embodiment, DNA
containing all or part of the coding sequence for a chromosome 12q23-qter
polypeptide, or DNA which hybridizes to DNA having the sequence SEQ ID
NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, is
incorporated into a vector for expression of the encoded polypeptide in
suitable host cells. The polypeptides encoded by chromosome
12q23-qter genes, or their functional equivalents, are capable of normal
activity, such as that of Gene 702, a metalloprotease-like gene involved in
inflammatory responses including tissue destruction and repair. A large
number of vectors, including bacterial, yeast, and mammalian vectors, have
been described for replication and/or expression in various host cells or
cell-free systems, and may be used for gene therapy as well as for simple
cloning or protein expression.

In one aspect, an expression vector comprises a nucleic acid
encoding a 12q23-qter polypeptide or peptide, as described herein, operably
linked to at least one regulatory sequence. Regulatory sequences are
known in the art and are selected to direct expression of the desired protein
in an appropriate host cell. Accordingly, the term regulatory sequence
includes promoters, enhancers and other expression control elements (see
D.V. Goeddel, 1990, Methods Enzymol. 185:3-7). Enhancer and other
expression control sequences are described in Enhancers and Eukaryotic
Gene Expression, 1983, Cold Spring Harbor Press, Cold Spring Harbor, NY.
It should be understood that the design of the expression vector may
depend on such factors as the choice of the host cell to be transfected
and/or the type of polypeptide to be expressed.

Several regulatory elements (e.g., promoters) have been isolated and
shown to be effective in the transcription and translation of heterologous
proteins in the various hosts. Such regulatory regions, methods of isolation,
manner of manipulation, etc. are known in the art. Non-limiting examples of
bacterial promoters include the β-lactamase (penicillinase) promoter; lactose
promoter; tryptophan (trp) promoter; araBAD (arabinose) operon promoter;
lambda-derived PL promoter and N gene ribosome binding site; and the
hybrid tac promoter derived from sequences of the trp and lac UV5
promoters. Non-limiting examples of yeast promoters include the
3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) promoter, galactokinase (GAL1) promoter,
galactoepimerase promoter, and alcohol dehydrogenase (ADH1) promoter.
Suitable promoters for mammalian cells include, without limitation, viral
promoters, such as those from Simian Virus 40 (SV40), Rous sarcoma virus
(RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Preferred
replication and inheritance systems include M13, ColE1, SV40, baculovirus,
lambda, adenovirus, CEN ARS, 2μm ARS and the like. While expression
vectors may replicate autonomously, they may also replicate by being
inserted into the genome of the host cell, by methods well known in the art.

To obtain expression in eukaryotic cells, terminator sequences,
polyadenylation sequences, and enhancer sequences that modulate gene
expression may be required. Sequences that cause amplification of the
gene may also be desirable. These sequences are well known in the art.
Furthermore, sequences that facilitate secretion of the recombinant product
from cells, including, but not limited to, bacteria, yeast, and animal cells,
such as secretory signal sequences and/or preprotein or proprotein
sequences, may also be included. Such sequences are well described in
the art.

Expression and cloning vectors will likely contain a selectable marker,
a gene encoding a protein necessary for survival or growth of a host cell
transformed with the vector. The presence of this gene ensures growth of
only those host cells that express the inserts. Typical selection genes
encode proteins that 1) confer resistance to antibiotics or other toxic
substances, e.g., ampicillin, neomycin, methotrexate, etc.; 2) complement
auxotrophic deficiencies; or 3) supply critical nutrients not available from
complex media, e.g., the gene encoding D-alanine racemase for Bacilli.
Markers may be an inducible or non-inducible gene and will generally allow
for positive selection. Non-limiting examples of markers include the
ampicillin resistance marker (i.e., beta-lactamase), tetracycline resistance
marker, neomycin/kanamycin resistance marker (i.e., neomycin
phosphotransferase), dihydrofolate reductase, glutamine synthetase, and
the like. The choice of the proper selectable marker will depend on the host
cell, and appropriate markers for different hosts are understood by those of
skill in the art.

Suitable expression vectors for use with the present invention
include, but are not limited to, pUC, pBluescript (Stratagene), pET
(Novagen, Inc., Madison, WI), and pREP (Invitrogen) plasmids. Vectors can
contain one or more replication and inheritance systems for cloning or
expression, one or more markers for selection in the host, e.g., antibiotic
resistance, and one or more expression cassettes. The inserted coding
sequences can be synthesized by standard methods, isolated from natural
sources, or prepared as hybrids. Ligation of the coding sequences to
transcriptional regulatory elements (e.g., promoters, enhancers, and/or
insulators) and/or to other amino acid encoding sequences can be carried
out using established methods.

Suitable cell-free expression systems for use with the present
invention include, without limitation, rabbit reticulocyte lysate, wheat germ
extract, canine pancreatic microsomal membranes, E. coli S30 extract, and
coupled transcription/translation systems (Promega Corp., Madison, WI).
These systems allow the expression of recombinant polypeptides or
peptides upon the addition of cloning vectors, DNA fragments, or RNA
sequences containing protein-coding regions and appropriate promoter
elements.

Non-limiting examples of suitable host cells include bacteria, archaea,
insect, fungi (e.g., yeast), plant, and animal cells (e.g., mammalian,
especially human). Of particular interest are Escherichia coli, Bacillus
subtilis, Saccharomyces cerevisiae, SF9 cells, C129 cells, 293 cells,
Neurospora, and immortalized mammalian myeloid and lymphoid cell lines.
Techniques for the propagation of mammalian cells in culture are
well-known (see, Jakoby and Pastan (eds), 1979, Cell Culture, Methods in
Enzymology, volume 58, Academic Press, Inc., Harcourt Brace Jovanovich,
NY). Examples of commonly used mammalian host cell lines are VERO and
HeLa cells, CHO cells, and WI38, BHK, and COS cell lines, although it will
be appreciated by the skilled practitioner that other cell lines may be used,
e.g., to provide higher expression, desirable glycosylation patterns, or other
features.

Host cells can be transformed, transfected, or infected as appropriate
by any suitable method including electroporation, calcium chloride-, lithium
chloride-, lithium acetate/polyethylene glycol-, calcium phosphate-, DEAE-
dextran-, liposome-mediated DNA uptake, spheroplasting, injection,
microinjection, microprojectile bombardment, phage infection, viral infection,
or other established methods. Alternatively, vectors containing the nucleic
acids of interest can be transcribed in vitro, and the resulting RNA
introduced into the host cell by well-known methods, e.g., by injection (see,
Kubo et al., 1988, FEBS Letts. 241:119). The cells into which the nucleic
acids described above have been introduced are meant to also include the
progeny of such cells.

The nucleic acids of the invention may be isolated directly from cells.
Alternatively, the polymerase chain reaction (PCR) method can be used to
produce the nucleic acids of the invention, using either RNA (e.g., mRNA) or
DNA (e.g., genomic DNA) as templates. Primers used for PCR can be
synthesized using the sequence information provided herein and can further
be designed to introduce appropriate new restriction sites, if desirable, to
facilitate incorporation into a given vector for recombinant expression.
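
As a non-limiting illustration of primers designed to introduce new restriction sites, the following Python sketch appends a recognition site to the 5' end of each primer. The function names, the short "GC" leader, the annealing length, and the template sequence are assumptions made for this example only; GGATCC and GAATTC are the BamHI and EcoRI recognition sites, chosen arbitrarily. The reverse primer is the reverse complement of the 3' end of the template strand shown.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def cloning_primers(template, anneal_len=18,
                    fwd_site="GGATCC", rev_site="GAATTC", leader="GC"):
    """Return (forward, reverse) primers, each written 5'->3' as
    leader + restriction site + template-specific annealing region."""
    template = template.upper()
    if len(template) < anneal_len:
        raise ValueError("template shorter than annealing length")
    forward = leader + fwd_site + template[:anneal_len]
    reverse = leader + rev_site + reverse_complement(template[-anneal_len:])
    return forward, reverse


# Made-up 18-base template; a short annealing length keeps the output readable.
fwd, rev = cloning_primers("ATGAAACCCGGGTTTTAG", anneal_len=6)
print(fwd)  # GCGGATCCATGAAA
print(rev)  # GCGAATTCCTAAAA
```

After amplification, the product carries a different site at each end, which permits directional insertion into a vector cut with the same two enzymes, provided neither site occurs within the amplified insert.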

Using the information provided in SEQ ID NO:1 to SEQ ID NO:92 and
SEQ ID NO:156 to SEQ ID NO:4687, one skilled in the art will be able to
clone and sequence all representative nucleic acids of interest, including
nucleic acids encoding complete protein-coding sequences. It is to be
understood that non-protein-coding sequences contained within SEQ ID
NO:156 to SEQ ID NO:693 and SEQ ID NO:694 to SEQ ID NO:979 are also
within the scope of the invention. Such sequences include, without
limitation, sequences important for replication, recombination, transcription,
and translation. Non-limiting examples include promoters and regulatory
binding sites involved in regulation of gene expression, and 5'- and
3'-untranslated sequences (e.g., ribosome-binding sites) that form part of
mRNA molecules.

The nucleic acids of this invention can be produced in large quantities
by replication in a suitable host cell. Natural or synthetic nucleic acid
fragments, comprising at least ten contiguous bases coding for a desired
peptide or polypeptide can be incorporated into recombinant nucleic acid
constructs, usually DNA constructs, capable of introduction into and
replication in a prokaryotic or eukaryotic cell. Usually the nucleic acid
constructs will be suitable for replication in a unicellular host, such as yeast
or bacteria, but may also be intended for introduction to (with and without
integration within the genome) cultured mammalian or plant or other
eukaryotic cells, cell lines, tissues, or organisms. The purification of nucleic
acids produced by the methods of the present invention is described, for
example, in Sambrook et al., 1989; F.M. Ausubel et al., 1992, Current
Protocols in Molecular Biology, J. Wiley and Sons, New York, NY.

The nucleic acids of the present invention can also be produced by
chemical synthesis, e.g., by the phosphoramidite method described by
Beaucage et al., 1981, Tetra. Letts. 22:1859-1862, or the triester method
according to Matteucci et al., 1981, J. Am. Chem. Soc. 103:3185, and can
be performed on commercial, automated oligonucleotide synthesizers. A
double-stranded fragment may be obtained from the single-stranded product
of chemical synthesis either by synthesizing the complementary strand and
annealing the strands together under appropriate conditions or by adding
the complementary strand using DNA polymerase with an appropriate
primer sequence.

These nucleic acids can encode full-length variant forms of proteins
as well as the wild-type protein. The variant proteins (which could be
especially useful for detection and treatment of disorders) will have the
variant amino acid sequences encoded by the polymorphisms described in
Table 10, when said polymorphisms are read so as to be in-frame with the
full-length coding sequence of which they are a component.

Large quantities of the nucleic acids and proteins of the present
invention may be prepared by expressing the 12q23-qter nucleic acids or
portions thereof in vectors or other expression vectors in compatible
prokaryotic or eukaryotic host cells. The most commonly used prokaryotic
hosts are strains of Escherichia coli, although other prokaryotes, such as
Bacillus subtilis or Pseudomonas may also be used. Mammalian or other
eukaryotic host cells, such as those of yeast, filamentous fungi, plant, insect,
or amphibian or avian species, may also be useful for production of the
proteins of the present invention. For example, insect cell systems (i.e.,
lepidopteran host cells and baculovirus expression vectors) are particularly
suited for large-scale protein production.

Host cells carrying an expression vector (i.e., transformants or
clones) are selected using markers depending on the mode of the vector
construction. The marker may be on the same or a different DNA molecule,
preferably the same DNA molecule. In prokaryotic hosts, the transformant
may be selected, e.g., by resistance to ampicillin, tetracycline or other
antibiotics. Production of a particular product based on temperature
sensitivity may also serve as an appropriate marker.

Prokaryotic or eukaryotic cells comprising the nucleic acids of the
present invention will be useful not only for the production of the nucleic
acids and proteins of the present invention, but also, for example, in
studying the characteristics of 12q23-qter proteins. Cells and animals that
carry a 12q23-qter gene can be used as model systems to study and test for
substances that have potential as therapeutic agents. The cells are typically
cultured mesenchymal stem cells. These may be isolated from individuals
with a somatic or germline 12q23-qter gene. Alternatively, the cell line can
be engineered to carry a 12q23-qter gene, as described above. After a test
substance is applied to the cells, the transformed phenotype of the cell is
determined. Any trait of transformed cells can be assessed, including
respiratory diseases including asthma, atopy, and response to application of
putative therapeutic agents.

ANTISENSE NUCLEIC ACIDS

A further embodiment of the invention is antisense nucleic acids or
oligonucleotides which are complementary, in whole or in part, to a target
molecule comprising a sense strand, and can hybridize with the target
molecule. The target can be DNA, or its RNA counterpart (i.e., wherein T
residues of the DNA are U residues in the RNA counterpart). When
introduced into a cell, antisense nucleic acids or oligonucleotides can inhibit
the expression of the gene encoded by the sense strand or the mRNA
transcribed from the sense strand. Antisense nucleic acids can be produced
by standard techniques. See, for example, Shewmaker, et al., U.S. Patent
No. 5,107,065.
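
As a non-limiting illustration of the complementarity described above, the following Python sketch derives the RNA counterpart of a sense DNA strand (T replaced by U) and an antisense RNA that can base-pair with it. The function names and the six-base example are hypothetical and serve only to show the base-pairing relationship; all sequences are written 5' to 3'.

```python
def rna_counterpart(sense_dna):
    """RNA counterpart of a sense DNA strand: T residues become U."""
    return sense_dna.upper().replace("T", "U")


def antisense_rna(sense_dna):
    """Antisense RNA (5'->3') complementary to the sense strand and to
    the mRNA transcribed from it."""
    pairs = {"A": "U", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(sense_dna.upper()))


sense = "ATGGCC"
print(rna_counterpart(sense))  # AUGGCC
print(antisense_rna(sense))    # GGCCAU
```

Read in opposite directions, each base of the antisense strand pairs with the corresponding base of the mRNA (G with C, A with U), which is the basis for hybridization to the target.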

In a particular embodiment, an antisense nucleic acid or
oligonucleotide is wholly or partially complementary to and can hybridize
with a target nucleic acid (either DNA or RNA), wherein the target nucleic
acid can hybridize to a nucleic acid having the sequence of the complement
of the strands in SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to
SEQ ID NO:4687. For example, an antisense nucleic acid or
oligonucleotide can be complementary to a target nucleic acid having the
sequence shown as the strand of the open reading frames SEQ ID NO:1 to
SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687 or nucleic acids
encoding functional equivalents of chromosome 12q23-qter genes, or to a
portion of these nucleic acids sufficient to allow hybridization. A portion, for
example a sequence of 16 nucleotides, could be sufficient to inhibit
expression of the protein. Or, an antisense nucleic acid or oligonucleotide
complementary to 5' or 3' untranslated regions, or overlapping the
translation initiation codons (5' untranslated and translated regions), of
chromosome 12q23-qter genes, or genes encoding a functional equivalent,
can also be effective. In another embodiment, the antisense nucleic acid is
wholly or partially complementary to and can hybridize with a target nucleic
acid that encodes a chromosome 12q23-qter polypeptide.




In addition to the antisense nucleic acids of the invention,
oligonucleotides can be constructed which will bind to duplex nucleic acids
either in the genes or the DNA:RNA complexes of transcription, to form
stable triple helix-containing or triplex nucleic acids to inhibit transcription
and/or expression of a gene encoding a chromosome 12q23-qter gene, or
their functional equivalents (Frank-Kamenetskii, M.D. and Mirkin, S.M.,
1995, Ann. Rev. Biochem. 64:65-95). Such oligonucleotides of the invention
are constructed using the base-pairing rules of triple helix formation and the
nucleotide sequences of the genes or mRNAs for chromosome 12q23-qter
genes.

In preferred embodiments, at least one of the phosphodiester bonds
of an antisense oligonucleotide has been substituted with a structure that
functions to enhance the ability of the compositions to penetrate into the
region of cells where the RNA whose activity is to be modulated is located.
It is preferred that such substitutions comprise phosphorothioate bonds,
methyl phosphonate bonds, or short chain alkyl or cycloalkyl structures. In
accordance with other preferred embodiments, the phosphodiester bonds
are substituted with structures which are, at once, substantially non-ionic
and non-chiral, or with structures which are chiral and enantiomerically
specific. Persons of ordinary skill in the art will be able to select other
linkages for use in the practice of the invention.

Oligonucleotides may also include species that include at least some
modified base forms. Thus, purines and pyrimidines other than those
normally found in nature may be so employed. Similarly, modifications on
the furanosyl portions of the nucleotide subunits may also be effected as
long as the essential tenets of this invention are adhered to. Examples of
such modifications are 2'-O-alkyl- and 2'-halogen-substituted nucleotides.
Some non-limiting examples of modifications at the 2' position of sugar
moieties which are useful in the present invention include OH, SH, SCH3, F,
OCH3, OCN, O(CH2)nNH2 and O(CH2)nCH3, where n is from 1 to about 10.
Such oligonucleotides are functionally interchangeable with natural
oligonucleotides or synthesized oligonucleotides, but have one or more
differences from the natural structure. All such analogs are comprehended
by this invention so long as they function effectively to hybridize with a
12q23-qter nucleic acid to inhibit the function thereof.

The oligonucleotides in accordance with this invention preferably
comprise from about 3 to about 50 subunits. It is more preferred that such
oligonucleotides and analogs comprise from about 8 to about 25 subunits
and still more preferred to have from about 12 to about 20 subunits. As
defined herein, a "subunit" is a base and sugar combination suitably bound
to adjacent subunits through phosphodiester or other bonds.

Antisense nucleic acids or oligonucleotides can be produced by
standard techniques (see, e.g., Shewmaker et al., U.S. Patent No.
5,107,065). The oligonucleotides used in accordance with this invention may
be conveniently and routinely made through the well-known technique of
solid phase synthesis. Equipment for such synthesis is available from
several vendors, including PE Applied Biosystems (Foster City, CA). Any
other means for such synthesis may also be employed; however, the actual
synthesis of the oligonucleotides is well within the abilities of the practitioner.
It is also well known to prepare other oligonucleotides such as
phosphorothioates and alkylated derivatives.

The oligonucleotides of this invention are designed to be hybridizable
with 12q23-qter RNA (e.g., mRNA) or DNA. For example, an
oligonucleotide (e.g., DNA oligonucleotide) that hybridizes to 12q23-qter
mRNA can be used to target the mRNA for RNase H digestion. Alternatively,
an oligonucleotide that hybridizes to the translation initiation site of
12q23-qter mRNA can be used to prevent translation of the mRNA. In another
approach, oligonucleotides that bind to the double-stranded DNA of
12q23-qter can be administered. Such oligonucleotides can form a triplex
construct and inhibit the transcription of the DNA encoding 12q23-qter
polypeptides. Triple helix pairing prevents the double helix from opening
sufficiently to allow the binding of polymerases, transcription factors, or
regulatory molecules. Recent therapeutic advances using triplex DNA have been
described (see, e.g., J.E. Gee et al., 1994, Molecular and Immunologic
Approaches, Futura Publishing Co., Mt. Kisco, NY).

As non-limiting examples, antisense oligonucleotides may be
targeted to hybridize to the following regions: mRNA cap region; translation
initiation site; translational termination site; transcription initiation site;
transcription termination site; polyadenylation signal; 3' untranslated region;
5' untranslated region; 5' coding region; mid coding region; and 3' coding
region. Preferably, the complementary oligonucleotide is designed to
hybridize to the most unique 5' sequence of a 12q23-qter gene, including
any of about 15-35 nucleotides spanning the 5' coding sequence.
Appropriate oligonucleotides can be designed using OLIGO software
(Molecular Biology Insights, Inc., Cascade, CO; http://www.oligo.net).
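
As a non-limiting illustration of targeting the 5' coding sequence, the following Python sketch enumerates candidate antisense oligonucleotides of a chosen length across the first bases of a coding sequence. It is not the OLIGO software and does not assess uniqueness; the function name, the default region length, and the made-up coding sequence are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def candidate_antisense_oligos(coding_dna, oligo_len=15, region_len=35):
    """List (start, oligo) pairs: every window of `oligo_len` bases within
    the first `region_len` bases of the coding strand, returned as its
    reverse complement (the antisense DNA oligo, 5'->3')."""
    if not 15 <= oligo_len <= 35:
        raise ValueError("oligo_len is expected to be about 15-35 bases")
    region = coding_dna.upper()[:region_len]
    candidates = []
    for start in range(len(region) - oligo_len + 1):
        target = region[start:start + oligo_len]
        candidates.append((start, target.translate(COMPLEMENT)[::-1]))
    return candidates


coding = "ATG" + "GCT" * 12  # made-up 39-base coding sequence
oligos = candidate_antisense_oligos(coding)
print(len(oligos))  # 21
print(oligos[0])    # (0, 'AGCAGCAGCAGCCAT')
```

Each candidate would then be screened, for example against other known sequences, to select the most unique target.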

In accordance with the present invention, an antisense
oligonucleotide can be synthesized, formulated as a pharmaceutical
composition, and administered to a subject. The synthesis and utilization of
antisense and triplex oligonucleotides have been previously described (e.g.,
H. Simon et al., 1999, Antisense Nucleic Acid Drug Dev. 9:527-31; F.X.
Barre et al., 2000, Proc. Natl. Acad. Sci. USA 97:3084-3088; R. Elez et al.,
2000, Biochem. Biophys. Res. Commun. 269:352-6; E.R. Sauter et al.,
2000, Clin. Cancer Res. 6:654-60). Alternatively, expression vectors
derived from retroviruses, adenovirus, herpes or vaccinia viruses, or from
various bacterial plasmids may be used for delivery of nucleotide sequences
to the targeted organ, tissue or cell population. Methods which are well
known to those skilled in the art can be used to construct recombinant
vectors which will express a nucleic acid sequence that is complementary to
the nucleic acid sequence encoding a 12q23-qter polypeptide. These
techniques are described both in Sambrook et al., 1989 and in Ausubel et
al., 1992. For example, 12q23-qter expression can be inhibited by
transforming a cell or tissue with an expression vector that expresses high
levels of untranslatable 12q23-qter sense or antisense sequences. Even in
the absence of integration into the DNA, such vectors may continue to
transcribe RNA molecules until they are disabled by endogenous nucleases.
Transient expression may last for a month or more with a non-replicating
vector, and even longer if appropriate replication elements are included in
the vector system.

Various assays may be used to test the ability of antisense
oligonucleotides to inhibit 12q23-qter gene expression. For example,
12q23-qter mRNA levels can be assessed by Northern blot analysis (Sambrook
et al., 1989; Ausubel et al., 1992; J.C. Alwine et al., 1977, Proc. Natl. Acad.
Sci. USA 74:5350-5354; I.M. Bird, 1998, Methods Mol. Biol. 105:325-36),
quantitative or semi-quantitative RT-PCR analysis (see, e.g., W.M. Freeman
et al., 1999, Biotechniques 26:112-122; Ren et al., 1998, Mol. Brain Res.
59:256-63; J.M. Cale et al., 1998, Methods Mol. Biol. 105:351-71), or in situ
hybridization (reviewed by A.K. Raap, 1998, Mutat. Res. 400:287-298).
Alternatively, 12q23-qter polypeptide levels can be measured, e.g., by
western blot analysis, indirect immunofluorescence, or immunoprecipitation
techniques (see, e.g., J.M. Walker, 1998, Protein Protocols on CD-ROM,
Humana Press, Totowa, NJ).

POLYPEPTIDES

The invention also relates to 12q23-qter proteins or polypeptides
encoded by the nucleic acids described herein, e.g., SEQ ID NO:93 to SEQ
ID NO:155, or portions or variants thereof. The proteins and polypeptides of
this invention can be isolated and/or recombinant. In a preferred
embodiment, the proteins or portions thereof have at least one function
characteristic of a chromosome 12q23-qter protein or polypeptide, for
example, Gene 702, a metalloprotease-like gene, the product of which is
involved in inflammatory responses including, but not limited to, tissue
destruction and repair. These proteins are referred to as analogs, and the
genes encoding them include, for example, naturally occurring chromosome
12q23-qter genes, variants (e.g., mutants) encoding those proteins and/or
portions thereof. Such protein or polypeptide variants include mutants
differing by the addition, deletion or substitution of one or more amino acid
residues, or modified polypeptides in which one or more residues are
modified (e.g., by phosphorylation, sulfation, acylation, etc.), and mutants
comprising one or more modified residues. The variant can have
"conservative" changes, wherein a substituted amino acid has similar
structural or chemical properties, e.g., replacement of leucine with
isoleucine. More infrequently, a variant can have "nonconservative"
changes, e.g., replacement of a glycine with a tryptophan. Guidance in
determining which amino acid residues can be substituted, inserted, or
deleted without abolishing biological or immunological activity can be
determined using computer programs well known in the art, for example,
DNASTAR software (DNASTAR, Inc., Madison, WI).

As non-limiting examples, conservative substitutions in a 12q23-qter 
amino acid sequence can be made in accordance with the following table: 



Original    Conservative        Original    Conservative
Residue     Substitution(s)     Residue     Substitution(s)

Ala         Ser                 Leu         Ile, Val
Arg         Lys                 Lys         Arg, Gln, Glu
Asn         Gln, His            Met         Leu, Ile
Asp         Glu                 Phe         Met, Leu, Tyr
Cys         Ser                 Ser         Thr
Gln         Asn                 Thr         Ser
Glu         Asp                 Trp         Tyr
Gly         Pro                 Tyr         Trp, Phe
His         Asn, Gln            Val         Ile, Leu
Ile         Leu, Val
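
As a minimal sketch, the table above can be transcribed into a lookup
structure so that a proposed substitution can be checked against it. The
dictionary uses standard three-letter residue codes; the names
CONSERVATIVE and is_conservative are illustrative assumptions and are not
part of the disclosure.

```python
# Conservative substitutions transcribed from the table above.
# Keys are original residues; values are the listed substitutions.
CONSERVATIVE = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the table lists `replacement` for `original`."""
    return replacement in CONSERVATIVE.get(original, set())
```

The lookup is directional, as in the table: a pair is reported as
conservative only if the table lists it for that original residue.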









Substantial changes in function or immunogenicity can be made by
selecting substitutions that are less conservative than those shown in the
table above. For example, non-conservative substitutions can be made
which more significantly affect the structure of the polypeptide in the area of
the alteration, for example, the alpha-helical or beta-sheet structure; the
charge or hydrophobicity of the molecule at the target site; or the bulk of the
side chain. The substitutions which generally are expected to produce the
greatest changes in the polypeptide's properties are those where 1) a
hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a
hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl, or alanyl; 2)
a cysteine or proline is substituted for (or by) any other residue; 3) a residue
having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is
substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl;
or 4) a residue having a bulky side chain, e.g., phenylalanine, is substituted
for (or by) a residue that does not have a side chain, e.g., glycine.

In one embodiment, the percent amino acid sequence identity
between a chromosome 12q23-qter polypeptide such as SEQ ID NO:93 to
SEQ ID NO:155, and functional equivalents thereof is at least 50%. In a
preferred embodiment, the percent amino acid sequence identity between
such a chromosome 12q23-qter polypeptide and its functional equivalents is
at least 65%. More preferably, the percent amino acid sequence identity
between a chromosome 12q23-qter polypeptide and its functional
equivalents is at least 75%, still more preferably, at least 80%, and even
more preferably, at least 90%.

Percent sequence identity can be calculated using computer
programs or direct sequence comparison. Preferred computer program
methods to determine identity between two sequences include, but are not
limited to, the GCG program package, FASTA, BLASTP, and TBLASTN
(see, e.g., D.W. Mount, 2001, Bioinformatics: Sequence and Genome
Analysis, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY).
The BLASTP and TBLASTN programs are publicly available from NCBI and
other sources. The well-known Smith-Waterman algorithm may also be
used to determine identity.

Exemplary parameters for amino acid sequence comparison include
the following: 1) algorithm from Needleman and Wunsch, 1970, J. Mol. Biol.
48:443-453; 2) BLOSUM62 comparison matrix from Henikoff and
Henikoff, 1992, Proc. Natl. Acad. Sci. USA 89:10915-10919; 3) gap penalty
= 12; and 4) gap length penalty = 4. A program useful with these parameters
is publicly available as the "gap" program (Genetics Computer Group,
Madison, WI). The aforementioned parameters are the default parameters
for polypeptide comparisons (with no penalty for end gaps).

Alternatively, polypeptide sequence identity can be calculated using
the following equation: % identity = (the number of identical residues) /
(alignment length in amino acid residues) * 100. For this calculation,
alignment length includes internal gaps but does not include terminal gaps.
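
The equation above can be sketched in a few lines of Python. This is an
illustrative sketch only: it assumes the two sequences have already been
aligned to equal length with "-" as the gap character, and the function
name percent_identity and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """% identity = identical residues / alignment length * 100.

    Alignment length counts internal gaps but not terminal gaps, so
    columns at either end where one sequence is gapped are trimmed first.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    start, end = 0, len(aligned_a)
    # Skip leading columns in which either sequence has a terminal gap.
    while start < end and gap in (aligned_a[start], aligned_b[start]):
        start += 1
    # Skip trailing columns in which either sequence has a terminal gap.
    while end > start and gap in (aligned_a[end - 1], aligned_b[end - 1]):
        end -= 1

    length = end - start
    if length == 0:
        return 0.0

    identical = sum(
        1
        for x, y in zip(aligned_a[start:end], aligned_b[start:end])
        if x == y and x != gap
    )
    return 100.0 * identical / length


# Hypothetical alignment: 8 columns remain after trimming the two
# leading terminal-gap columns, 6 of which are identical.
print(percent_identity("--MKTAYIAK", "GSMKT-YLAK"))  # 75.0
```

The internal gap in the second sequence stays in the denominator, which is
what distinguishes this calculation from one based on the shorter
sequence length.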

In accordance with the present invention, polypeptide sequences may
be identical to the sequence of any one of SEQ ID NO:93 to SEQ ID
NO:155, or may include up to a certain integer number of amino acid
alterations. Polypeptide alterations are selected from the group consisting
of at least one amino acid deletion, substitution, including conservative and
non-conservative substitution, or insertion. Alterations may occur at the
amino- or carboxy-terminal positions of the reference polypeptide sequence
or anywhere between those terminal positions, interspersed either
individually among the amino acids in the reference sequence or in one or
more contiguous groups within the reference sequence.

In specific embodiments, a polypeptide variant may be encoded by a
12q23-qter nucleic acid comprising a SNP and/or an alternate splice variant.
For example, a polypeptide variant may be encoded by a 12q23-qter
alternate splice variant comprising a nucleotide sequence of any one of SEQ
ID NO:1 to SEQ ID NO:5; SEQ ID NO:17 to SEQ ID NO:18; SEQ ID NO:36
to SEQ ID NO:37; SEQ ID NO:43 to SEQ ID NO:44; SEQ ID NO:80 to SEQ
ID NO:81, or any of the alternate splice sequences set forth in Table 4. In
addition, a polypeptide variant may be encoded by a nucleic acid containing
one or more 12q23-qter SNPs as set forth in Table 10; Figures 7A-7H;
Figures 9A-9F; Figures 27A-27K; and Figures 28A-28C. Specific examples
of amino acid changes encoded by 12q23-qter SNPs are provided in Table
10, and are described in detail hereinbelow.

The invention also relates to isolated, synthesized and/or
recombinant portions or fragments of a 12q23-qter protein or polypeptide as
described herein. Polypeptide fragments (i.e., peptides) can be made which
have full or partial function on their own, or which when mixed together
(though fully, partially, or nonfunctional alone), spontaneously assemble with
one or more other polypeptides to reconstitute a functional protein having at
least one functional characteristic of a 12q23-qter protein of this invention.
In addition, 12q23-qter polypeptide fragments may comprise, for example,
one or more domains of the 12q23-qter polypeptide disclosed herein. In
particular, a Gene 454 polypeptide may comprise one or more
transmembrane, extracellular, or intracellular domains; a Gene 561
polypeptide may comprise an SH3 domain and/or one or more fibronectin
type III repeats; and a Gene 757 polypeptide may comprise a cysteine-rich
domain, a Ser/Thr-XXX-Val motif, and/or one or more transmembrane
repeats (see below).

Polypeptides according to the invention can comprise at least 5
contiguous amino acid residues; preferably the polypeptides comprise at
least 12 contiguous residues; more preferably the polypeptides comprise at
least 20 contiguous residues; and yet more preferably the polypeptides
comprise at least 30 contiguous residues. Nucleic acids comprising
protein-coding sequences can be used to direct the expression of
asthma-associated polypeptides in intact cells or in cell-free translation
systems. The coding sequence can be tailored, if desired, for more efficient
expression in a given host organism, and can be used to synthesize
oligonucleotides encoding the desired amino acid sequences. The resulting
oligonucleotides can be inserted into an appropriate vector and expressed in
a compatible host organism or translation system.

The polypeptides of the present invention, including
function-conservative variants, may be isolated from wild-type or mutant
cells (e.g., human cells or cell lines), from heterologous organisms or cells
(e.g., bacteria, yeast, insect, plant, and mammalian cells), or from cell-free
translation systems (e.g., wheat germ, microsomal membrane, or bacterial
extracts) in which a protein-coding sequence has been introduced and
expressed. Furthermore, the polypeptides may be part of recombinant
fusion proteins. The polypeptides can also, advantageously, be made by
synthetic chemistry. Polypeptides may be chemically synthesized by
commercially available automated procedures, including, without limitation,
exclusive solid phase synthesis, partial solid phase methods, fragment
condensation or classical solution synthesis.

Methods for polypeptide purification are well-known in the art,
including, without limitation, preparative disc-gel electrophoresis, isoelectric
focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and
partition chromatography, and countercurrent distribution. For some
purposes, it is preferable to produce the polypeptide in a recombinant
system in which the protein contains an additional sequence (e.g., epitope
or protein) tag that facilitates purification. Non-limiting examples of epitope
tags include c-myc, hemagglutinin (HA), polyhistidine (6X-HIS) (SEQ ID
NO: ), GLU-GLU, and DYKDDDDK (SEQ ID NO: ) (FLAG®) epitope tags.
Non-limiting examples of protein tags include glutathione-S-transferase
(GST), green fluorescent protein (GFP), and maltose binding protein (MBP).

In one approach, the coding sequence of a polypeptide or peptide
can be cloned into a vector that creates a fusion with a sequence tag of
interest. Suitable vectors include, without limitation, pRSET (Invitrogen
Corp., San Diego, CA), pGEX (Amersham-Pharmacia Biotech, Inc.,
Piscataway, NJ), pEGFP (CLONTECH Laboratories, Inc., Palo Alto, CA),
and pMAL™ (New England BioLabs (NEB), Inc., Beverly, MA) plasmids.
Following expression, the epitope- or protein-tagged polypeptide or peptide
can be purified from a crude lysate of the translation system or host cell by
chromatography on an appropriate solid-phase matrix. In some cases, it
may be preferable to remove the epitope or protein tag (i.e., via protease
cleavage) following purification. As an alternative approach, antibodies
produced against a disorder-associated protein or against peptides derived
therefrom can be used as purification reagents. Other purification methods
are also possible.

The present invention also encompasses modifications of 12q23-qter
polypeptides. The isolated polypeptides may be modified by, for example,
phosphorylation, sulfation, acylation, or other protein modifications. They
may also be modified with a label capable of providing a detectable signal,
either directly or indirectly, including, but not limited to, radioisotopes and
fluorescent compounds, as described in detail herein.

Both the naturally occurring and recombinant forms of the
polypeptides of the invention can advantageously be used to screen
compounds for binding activity. Many methods of screening for binding
activity are known by those skilled in the art and may be used to practice the
invention. Several methods of automated assays have been developed in
recent years so as to permit screening of tens of thousands of compounds in
a short period of time. Such high-throughput screening methods are
particularly preferred. The use of high-throughput screening assays to test
for inhibitors is greatly facilitated by the availability of large amounts of
purified polypeptides, as provided by the invention. The polypeptides of the
invention also find use as therapeutic agents as well as antigenic
components to prepare antibodies.

The polypeptides of this invention find use as immunogenic
components useful as antigens for preparing antibodies by standard
methods. It is well known in the art that immunogenic epitopes generally
contain at least 5 contiguous amino acid residues (Ohno et al., 1985, Proc.
Natl. Acad. Sci. USA 82:2945). Therefore, the immunogenic components of
this invention will typically comprise at least 5 contiguous amino acid
residues of the sequence of the complete polypeptide chains. Preferably,
they will contain at least 7, and most preferably at least 10 contiguous amino
acid residues or more to ensure that they will be immunogenic. Whether a
given component is immunogenic can readily be determined by routine
experimentation. Such immunogenic components can be produced by
proteolytic cleavage of larger polypeptides or by chemical synthesis or
recombinant technology and are thus not limited by proteolytic cleavage
sites. The present invention thus encompasses antibodies that specifically
recognize asthma-associated immunogenic components.
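
As an illustration of the length thresholds above (at least 5, preferably 7,
most preferably 10 contiguous residues), candidate peptides of a given
length can be enumerated with a sliding window. This is a sketch under
stated assumptions: the function name contiguous_peptides and the example
sequence are hypothetical, and enumeration alone says nothing about
whether a given peptide is actually immunogenic, which the text leaves to
routine experimentation.

```python
def contiguous_peptides(sequence: str, length: int) -> list[str]:
    """Return every window of `length` contiguous residues in `sequence`."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # range() is empty when the sequence is shorter than the window.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 10-residue sequence in one-letter code (not from the
# Sequence Listing).
example = "MKTAYIAKQR"
for minimum in (5, 7, 10):
    peptides = contiguous_peptides(example, minimum)
    print(minimum, len(peptides), peptides[0])
```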

STRUCTURAL STUDIES 

A purified 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID
NO:155), or portions or complexes thereof, can be analyzed by
well-established methods (e.g., X-ray crystallography, NMR, CD, etc.) to
determine the three-dimensional structure of the molecule. The
three-dimensional structure, in turn, can be used to model intermolecular
interactions. Exemplary methods for crystallization and X-ray
crystallography are found in P.G. Jones, 1981, Chemistry in Britain, 17:222-
225; C. Jones et al. (eds), Crystallographic Methods and Protocols, Humana
Press, Totowa, NJ; A. McPherson, 1982, Preparation and Analysis of
Protein Crystals, John Wiley & Sons, New York, NY; T.L. Blundell and L.N.
Johnson, 1976, Protein Crystallography, Academic Press, Inc., New York,
NY; A. Holden and P. Singer, 1960, Crystals and Crystal Growing, Anchor
Books-Doubleday, New York, NY; R.A. Laudise, 1970, The Growth of Single
Crystals, Solid State Physical Electronics Series, N. Holonyak, Jr., (ed),
Prentice-Hall, Inc.; G.H. Stout and L.H. Jensen, 1989, X-ray Structure
Determination: A Practical Guide, 2nd edition, John Wiley & Sons, New
York, NY; Fundamentals of Analytical Chemistry, 3rd edition, Saunders
Golden Sunburst Series, Holt, Rinehart and Winston, Philadelphia, PA,
1976; P.D. Boyle of the Department of Chemistry of North Carolina State
University at http://laue.chem.ncsu.edu/web/GrowXtal.html; M.B. Berry,
1995, Protein Crystalization: Theory and Practice, Structure and Dynamics
of E. coli Adenylate Kinase, Doctoral Thesis, Rice University, Houston TX;
www.bioc.rice.edu/~berry/papers/crystalization/crystalization.html.

For X-ray diffraction studies, single crystals can be grown to suitable
size. Preferably, a crystal has a size of 0.2 to 0.4 mm in at least two of the
three dimensions. Crystals can be formed in a solution comprising a
12q23-qter polypeptide (e.g., 1.5-200 mg/ml) and reagents that reduce the
solubility to conditions close to spontaneous precipitation. Factors that
affect the formation of polypeptide crystals include: 1) purity; 2) substrates
or co-factors; 3) pH; 4) temperature; 5) polypeptide concentration; and 6)
characteristics of the precipitant. Preferably, the 12q23-qter polypeptides
are pure, i.e., free from contaminating components (at least 95% pure), and
free from denatured 12q23-qter polypeptides. In particular, polypeptides
can be purified by FPLC and HPLC techniques to assure homogeneity (see
Lin et al., 1992, J. Crystal. Growth. 122:242-245). Optionally, 12q23-qter
polypeptide substrates or co-factors can be added to stabilize the
quaternary structure of the protein and promote lattice packing.

Suitable precipitants for crystallization include, but are not limited to,
salts (e.g., ammonium sulphate, potassium phosphate); polymers (e.g.,
polyethylene glycol (PEG) 6000); alcohols (e.g., ethanol); polyalcohols (e.g.,
2-methyl-2,4 pentane diol (MPD)); organic solvents; sulfonic dyes; and
deionized water. The ability of a salt to precipitate polypeptides can be
generally described by the Hofmeister series: PO4^3- > HPO4^2- = SO4^2- >
citrate > CH3CO2^- > Cl^- > Br^- > NO3^- > ClO4^- > SCN^-; and NH4^+ >
K^+ > Na^+ > Li^+. Non-limiting examples of salt precipitants are shown
below (see Berry, 1995).

Precipitant                              Maximum concentration

(NH4^+/Na^+/Li^+)2 or Mg^2+ SO4^2-       4.0/1.5/2.1/2.5 M
NH4^+/Na^+/K^+ PO4^3-                    3.0/4.0/4.0 M
NH4^+/K^+/Na^+/Li^+ citrate              ~1.8 M
NH4^+/K^+/Na^+/Li^+ acetate              ~3.0 M
NH4^+/K^+/Na^+/Li^+ Cl^-                 5.2/9.8/4.2/5.4 M
NH4^+ NO3^-                              ~8.0 M

High molecular weight polymers useful as precipitating agents include
polyethylene glycol (PEG), dextran, polyvinyl alcohol, and polyvinyl
pyrrolidone (A. Polson et al., 1964, Biochim. Biophys. Acta. 82:463-475). In
general, polyethylene glycol (PEG) is the most effective for forming crystals.
PEG compounds with molecular weights less than 1000 can be used at
concentrations above 40% v/v. PEGs with molecular weights above 1000
can be used at concentrations of 5-50% w/v. Typically, PEG solutions are
mixed with ~0.1% sodium azide to prevent bacterial growth.

Typically, crystallization requires the addition of buffers and a specific
salt content to maintain the proper pH and ionic strength for a protein's
stability. Suitable additives include, but are not limited to, sodium chloride
(e.g., 50-500 mM as additive to PEG and MPD; 0.15-2 M as additive to
PEG); potassium chloride (e.g., 0.05-2 M); lithium chloride (e.g., 0.05-2 M);
sodium fluoride (e.g., 20-300 mM); ammonium sulfate (e.g., 20-300 mM);
lithium sulfate (e.g., 0.05-2 M); sodium or ammonium thiocyanate (e.g., 50-
500 mM); MPD (e.g., 0.5-50%); 1,6 hexane diol (e.g., 0.5-10%); 1,2,3
heptane triol (e.g., 0.5-15%); and benzamidine (e.g., 0.5-15%).

Detergents may be used to maintain protein solubility and prevent
aggregation. Suitable detergents include, but are not limited to, non-ionic
detergents such as sugar derivatives, oligoethyleneglycol derivatives,
dimethylamine-N-oxides, cholate derivatives, N-octyl
hydroxyalkylsulphoxides, sulphobetains, and lipid-like detergents.
Sugar-derived detergents include alkyl glucopyranosides (e.g., C8-GP,
C9-GP), alkyl thio-glucopyranosides (e.g., C8-tGP), alkyl maltopyranosides
(e.g., C10-M, C12-M; CYMAL-3, CYMAL-5, CYMAL-6), alkyl
thio-maltopyranosides, alkyl galactopyranosides, alkyl sucroses (e.g.,
N-octanoylsucrose), and glucamides (e.g., HECAMEG, C-HEGA-10;
MEGA-8). Oligoethyleneglycol-derived detergents include alkyl
polyoxyethylenes (e.g., C8-E5, C8-En; C12-E8; C12-E9) and phenyl
polyoxyethylenes (e.g., Triton X-100). Dimethylamine-N-oxide detergents
include, e.g., C10-DAO; DDAO; LDAO. Cholate-derived detergents include,
e.g., Deoxy-Big CHAP, digitonin. Lipid-like detergents include
phosphocholine compounds. Suitable detergents further include
zwitter-ionic detergents (e.g., ZWITTERGENT 3-10; ZWITTERGENT 3-12);
and ionic detergents (e.g., SDS).

Crystallization of macromolecules has been performed at
temperatures ranging from 60°C to less than 0°C. However, most
molecules can be crystallized at 4°C or 22°C. Lower temperatures promote
stabilization of polypeptides and inhibit bacterial growth. In general,
polypeptides are more soluble in salt solutions at lower temperatures (e.g.,
4°C), but less soluble in PEG and MPD solutions at lower temperatures. To
allow crystallization at 4°C or 22°C, the precipitant or protein concentration
can be increased or decreased as required. Heating, melting, and cooling of
crystals or aggregates can be used to enlarge crystals. In addition,
crystallization at both 4°C and 22°C can be assessed (A. McPherson, 1992,
J. Cryst. Growth. 122:161-167; C.W. Carter, Jr. and C.W. Carter, 1979, J.
Biol. Chem. 254:12219-12223; T. Bergfors, 1993, Crystalization Lab
Manual).

A crystallization protocol can be adapted to a particular polypeptide or
peptide. In particular, the physical and chemical properties of the
polypeptide can be considered (e.g., aggregation, stability, adherence to
membranes or tubing, internal disulfide linkages, surface cysteines,
chelating ions, etc.). For initial experiments, the standard set of
crystallization reagents can be used (Hampton Research, Laguna Niguel,
CA). In addition, the CRYSTOOL program can provide guidance in
determining optimal crystallization conditions (Brent Segelke, 1995,
Efficiency analysis of sampling protocols used in protein crystallization
screening and crystal structure from two novel crystal forms of PLA2, Ph.D.
Thesis, University of California, San Diego;
http://www.ccp14.ac.uk/ccp/web-mirrors/llnlmpp/crystool/crystool.htm).
Exemplary crystallization conditions are shown below (see Berry, 1995).



Major             Additive                      Concentration of    Concentration
Precipitant                                     Major Precipitant   of Additive

(NH4)2SO4         PEG 400-2000, MPD,            2.0-4.0 M           6%-0.5%
                  ethanol, or methanol
Na citrate        PEG 400-2000, MPD,            1.4-1.8 M           6%-0.5%
                  ethanol, or methanol
PEG 1000-20000    (NH4)2SO4, NaCl,              40-50%              0.2-0.6 M
                  or Na formate



Robots can be used for automatic screening and optimization of
crystallization conditions. For example, the IMPAX and Oryx systems can
be used (Douglas Instruments, Ltd., East Garston, United Kingdom). The
CRYSTOOL program (Segelke, supra) can be integrated with the robotics
programming. In addition, the Xact program can be used to construct,
maintain, and record the results of various crystallization experiments (see,
e.g., D.E. Brodersen et al., 1999, J. Appl. Cryst. 32:1012-1016; G.R.
Andersen and J. Nyborg, 1996, J. Appl. Cryst. 29:236-240). The Xact
program supports multiple users and organizes the results of crystallization
experiments into hierarchies. Advantageously, Xact is compatible with both
CRYSTOOL and Microsoft® Excel programs.

Four methods are commonly employed to crystallize
macromolecules: vapor diffusion, free interface diffusion, batch, and
dialysis. The vapor diffusion technique is typically performed by formulating
a 1:1 mixture of a solution comprising the polypeptide of interest and a
solution containing the precipitant at the final concentration that is to be
achieved after vapor equilibration. The drop containing the 1:1 mixture of
protein and precipitant is then suspended and sealed over the well solution,
which contains the precipitant at the target concentration, as either a
hanging or sitting drop. Vapor diffusion can be used to screen a large
number of crystallization conditions or when small amounts of polypeptide
are available. For screening, drop sizes of 1 to 2 µl can be used. Once
preliminary crystallization conditions have been determined, drop sizes such
as 10 µl can be used. Notably, results from hanging drops may be improved
with agarose gels (see K. Provost and M.-C. Robert, 1991, J. Cryst. Growth.
110:258-264). Free interface diffusion is performed by layering of a low
density solution onto one of higher density, usually in the form of
concentrated protein onto concentrated salt. Since the solute to be
crystallized must be concentrated, this method typically requires relatively
large amounts of protein. However, the method can be adapted to work with
small amounts of protein. In a representative experiment, 2 to 5 µl of
sample is pipetted into one end of a 20 µl microcapillary pipet. Next, 2 to
[illegible] µl of precipitant is pipetted into the capillary without introducing
an air bubble, and the ends of the pipet are sealed. With sufficient amounts
of protein, this method can be used to obtain relatively large crystals (see,
e.g., S.M. Althoff et al., 1988, J. Mol. Biol. 199:665-666).
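
For the vapor diffusion setup described in this paragraph, the
concentrations in the drop before and after equilibration can be estimated
with simple dilution arithmetic. The sketch below is an idealization, not a
protocol from the source: it assumes the reservoir is much larger than the
drop, that only water leaves the drop, and that equilibrium is reached when
the precipitant concentration in the drop equals that of the well. The
function name and the example values are hypothetical.

```python
def vapor_diffusion_drop(protein_stock, well_precipitant,
                         protein_ul=1.0, well_solution_ul=1.0):
    """Estimate drop composition at setup and after vapor equilibration.

    protein_stock    -- protein concentration of the stock (e.g., mg/ml)
    well_precipitant -- precipitant concentration in the well (e.g., M)
    protein_ul, well_solution_ul -- volumes mixed to form the drop (µl)
    """
    drop_ul = protein_ul + well_solution_ul

    # Mixing dilutes both components in proportion to the volumes used.
    start_protein = protein_stock * protein_ul / drop_ul
    start_precipitant = well_precipitant * well_solution_ul / drop_ul

    # Idealized equilibrium: water leaves the drop until its precipitant
    # concentration matches the well, so the drop volume shrinks.
    final_ul = drop_ul * start_precipitant / well_precipitant
    final_protein = start_protein * drop_ul / final_ul

    return {
        "start_protein": start_protein,
        "start_precipitant": start_precipitant,
        "final_drop_ul": final_ul,
        "final_protein": final_protein,
        "final_precipitant": well_precipitant,
    }


# Hypothetical 1:1 drop: 1 µl of 10 mg/ml protein plus 1 µl of 2.0 M well
# solution.
result = vapor_diffusion_drop(10.0, 2.0)
print(result)
```

Under these assumptions a 1:1 drop starts at half the stock protein
concentration and half the well precipitant concentration, and both roughly
double as the drop shrinks to half its starting volume.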

The batch technique is performed by mixing concentrated polypeptide
with concentrated precipitant to produce a final concentration that is
supersaturated for the solute macromolecule. Notably, this method can
employ relatively large amounts of solution (e.g., milliliter quantities), and
can produce large crystals. For that reason, the batch technique is not
recommended for screening initial crystallization conditions.

The dialysis technique is performed by diffusing precipitant molecules
through a semipermeable membrane to slowly increase the concentration of
the solute inside the membrane. Dialysis tubing can be used to dialyze
milliliter quantities of sample, whereas dialysis buttons can be used to
dialyze microliter quantities (e.g., 7-200 µl). Dialysis buttons may be
[illegible] Teflon [illegible] (see, e.g., Cambridge [illegible]).
In this way, polypeptides can be reused until conditions for
crystallization are found (see, e.g., C.W. Carter, Jr. et al., 1988, J. Cryst.
Growth. 90:60-73). However, this method is not recommended for
precipitants comprising concentrated PEG solutions.

Various strategies have been designed to screen crystallization
conditions, including 1) pI screening; 2) grid screening; 3) factorials; 4)
solubility assays; 5) perturbation; and 6) sparse matrices. In accordance
with the pI screening method, the pI of a polypeptide is presumed to be its
crystallization point. Screening at the pI can be performed by dialysis
against low concentrations of buffer (less than 20 mM) at the appropriate
pH or by use of conventional precipitants.

The grid screening method can be performed on two-dimensional
matrices. Typically, the precipitant concentration is plotted against pH. The
optimal conditions can be determined for each axis, and then combined. At
that point, additional factors can be tested (e.g., temperature, additives).
This method works best with fast-forming crystals, and can be readily
automated (see M.J. Cox and P.C. Weber, 1988, J. Cryst. Growth. 90:318-
324). Grid screens are commercially available for popular precipitants such
as ammonium sulphate, PEG 6000, MPD, PEG/LiCl, and NaCl (see, e.g.,
Hampton Research).
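
A two-dimensional grid of the kind described above can be laid out
programmatically. The following is a minimal sketch, assuming a 24-well
tray and illustrative concentration and pH values that are not taken from
the source; the function name grid_screen is likewise hypothetical.

```python
from itertools import product


def grid_screen(precipitant_concentrations, ph_values):
    """Pair every precipitant concentration with every pH value.

    Wells are numbered row by row, one row per precipitant concentration.
    """
    wells = []
    combinations = product(precipitant_concentrations, ph_values)
    for number, (concentration, ph) in enumerate(combinations, start=1):
        wells.append({"well": number, "precipitant": concentration, "pH": ph})
    return wells


# Hypothetical 4 x 6 grid: ammonium sulphate (M) against pH.
screen = grid_screen([1.6, 2.0, 2.4, 2.8], [5.0, 5.8, 6.6, 7.4, 8.2, 9.0])
for well in screen[:3]:
    print(well)
```

Once the best row and column have been identified, the same function can
be called again with a finer range around those values before additional
factors such as temperature are introduced.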

The incomplete factorial method can be performed by 1) selecting a
set of ~20 conditions; 2) randomly assigning combinations of these
conditions; 3) grading the success of the results of each experiment using
an objective scale; and 4) statistically evaluating the effects of each of the
conditions on crystal formation (see, e.g., C.W. Carter, Jr. et al., 1988, J.
Cryst. Growth. 90:60-73). In particular, conditions such as pH, temperature,
precipitating agent, and cations can be tested. Dialysis buttons are
preferably used with this method. Typically, optimal conditions/combinations
can be determined within 35 tests. Similar approaches, such as
"footprinting" conditions, may also be employed (see, e.g., E.A. Stura et al.,
1991, J. Cryst. Growth. 110:1-2).

The perturbation approach can be performed by altering
crystallization conditions by introducing a series of additives designed to test
the effects of altering the structure of bulk solvent and the solvent dielectric
on crystal formation (see, e.g., Whitaker et al., 1995, Biochem. 34:8221-
8226). Additives for increasing the solvent dielectric include, but are not
limited to, NaCl, KCl, or LiCl (e.g., 200 mM); Na formate (e.g., 200 mM);
Na2HPO4 or K2HPO4 (e.g., 200 mM); urea, trichloroacetate, guanidinium HCl,
or KSCN (e.g., 20-50 mM). A non-limiting list of additives for decreasing the
solvent dielectric includes methanol, ethanol, isopropanol, or tert-butanol
(e.g., 1-5%); MPD (e.g., 1%); PEG 400, PEG 600, or PEG 1000 (e.g., 1-
4%); PEG MME (monomethylether) 550, PEG MME 750, PEG MME 2000
[illegible].

As an alternative to the above screening methods, the sparse matrix
approach can be used (see, e.g., J. Jancarik and S.-H. Kim, 1991, J. Appl.
Cryst. 24:409-411; A. McPherson, 1992, J. Cryst. Growth. 122:161-167; B.
Cudney et al., 1994, Acta. Cryst. D50:414-423). Sparse matrix screens are
commercially available (see, e.g., Hampton Research; Molecular
Dimensions, Inc., Apopka, FL; Emerald Structures, Inc., Lemont).
Notably, data from Hampton Research sparse matrix screens can be stored
and analyzed using ASPRUN software (Douglas Instruments).
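
The four steps of the incomplete factorial method described above (select
conditions, assign combinations at random, grade each experiment,
evaluate each condition) can be sketched as follows. This is a simplified
illustration and not the published procedure: the balanced random
assignment, the per-level mean as the "statistical evaluation", and all
names and values are assumptions made for the example.

```python
import random
from collections import defaultdict


def incomplete_factorial(factors, n_experiments, seed=0):
    """Randomly combine factor levels into `n_experiments` experiments.

    Each factor's levels are repeated so they occur about equally often,
    then shuffled independently of the other factors.
    """
    rng = random.Random(seed)
    columns = {}
    for name, levels in factors.items():
        column = [levels[i % len(levels)] for i in range(n_experiments)]
        rng.shuffle(column)
        columns[name] = column
    return [
        {name: columns[name][i] for name in factors}
        for i in range(n_experiments)
    ]


def mean_score_by_level(design, scores, factor):
    """Average the graded scores of the experiments sharing each level."""
    grouped = defaultdict(list)
    for experiment, score in zip(design, scores):
        grouped[experiment[factor]].append(score)
    return {level: sum(v) / len(v) for level, v in grouped.items()}


# Hypothetical factors and levels for a 12-experiment design.
factors = {
    "pH": [5.5, 7.0, 8.5],
    "temperature_C": [4, 22],
    "precipitant": ["(NH4)2SO4", "PEG 6000", "MPD"],
}
design = incomplete_factorial(factors, n_experiments=12)
for experiment in design[:3]:
    print(experiment)
```

After the experiments are graded on an objective scale, passing the design
and the list of scores to mean_score_by_level for each factor gives a first
indication of which levels favor crystal formation.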

Exemplary conditions for an initial screen are shown below (see
Berry, 1995).

TABLE 1A: CRYSTALLIZATION CONDITIONS



Tray 1 




Tray 2:

PEG 2000 MME/0.2 M Ammon. sulfate (wells 25-30)

Well            25      26      27      28      29      30
PEG 2000 MME    25%     25%     25%     40%     40%     40%
pH              5.5     7.0     8.5     5.5     7.0     8.5

Random for wells 31 to 48



The initial screen can be used with hanging or sitting drops. To
conserve the sample, tray 2 can be set up several weeks following tray 1.
Wells 31-48 of tray 2 can comprise a random set of solutions. Alternatively,
solutions can be formulated using sparse methods. Preferably, test
solutions cover a broad range of precipitants, additives, and pH (especially
pH 5.0-9.0).

Seeding can be used to trigger nucleation and crystal growth (Stura
and Wilson, 1990, J. Cryst. Growth. 110:270-282; C. Thaller et al., 1981, J.
Mol. Biol. 147:465-469; A. McPherson and P. Schlichta, 1988, J. Cryst.
Growth. 90:47-50). In general, seeding can be performed by transferring
crystal seeds into a polypeptide solution to allow polypeptide molecules to
deposit on the surface of the seeds and produce crystals. Two seeding
methods can be used: microseeding and macroseeding. For microseeding,
a crystal can be ground into tiny pieces and transferred into the protein
solution. Alternatively, seeds can be transferred by adding 1-2 µl of the
seed solution directly to the equilibrated protein solution. In another
approach, seeds can be transferred by dipping a hair in the seed solution
and then streaking the hair across the surface of the drop (streak seeding;
see Stura and Wilson, supra). For macroseeding, an intact crystal can be
transferred into the protein solution (see, e.g., C. Thaller et al., 1981, J. Mol.
Biol. 147:465-469). Preferably, the surface of the crystal seed is washed to
regenerate the growing surface prior to being transferred. Optimally, the
protein solution for crystallization is close to saturation and the crystal seed
is not completely dissolved upon transfer.

ANTIBODIES

Another aspect of the invention pertains to antibodies directed to
12q23-qter polypeptides, or portions or variants thereof. The invention
provides polyclonal and monoclonal antibodies that bind 12q23-qter
polypeptides or peptides. The antibodies may be elicited in an animal host
(e.g., rabbit, goat, mouse, or other non-human mammal) by immunization
with disorder-associated immunogenic components. Antibodies may also
be elicited by in vitro immunization (sensitization) of immune cells. The
immunogenic components used to elicit the production of antibodies may be
isolated from cells or chemically synthesized. The antibodies may also be
produced in recombinant systems programmed with appropriate
antibody-encoding DNA. Alternatively, the antibodies may be constructed by
biochemical reconstitution of purified heavy and light chains. The antibodies
include hybrid antibodies, chimeric antibodies, and univalent antibodies.
Also included are Fab fragments, including Fab' and F(ab')2 fragments of
antibodies.

In accordance with the present invention, antibodies are directed to a
12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155), or
variants, or portions thereof. For example, antibodies can be produced to
bind to a 12q23-qter polypeptide encoded by an alternate splice variant
comprising a nucleotide sequence of any one of SEQ ID NO:1 to SEQ ID
NO:5; SEQ ID NO:17 to SEQ ID NO:18; SEQ ID NO:36 to SEQ ID NO:37;
SEQ ID NO:43 to SEQ ID NO:44; SEQ ID NO:80 to SEQ ID NO:81; or any
of the alternate splice sequences set forth in Table 4. As another example,
antibodies can be produced to bind to a 12q23-qter polypeptide variant
encoded by a nucleic acid containing one or more 12q23-qter SNPs as set
forth in Table 10; Figures 7A-7H; Figures 9A-9F; Figures 27A-27K; and
Figures 28A-28C. Such antibodies can be used as diagnostic and/or
therapeutic reagents.
30 An isolated 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID 

NO: 155), or variant, or portion thereof, can be used as an immunogen to 
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generate antibodies using standard techniques for polyclonal and 
monoclonal antibody preparation. A full-length 12q23-qter polypeptide can 
be used or, alternatively, the invention provides antigenic peptide portions of 
12q23-qter for use as immunogens. The antigenic peptide of 12q23-qter 
5 comprises at least 5 contiguous amino acid residues of the amino acid 
sequence shown in any one of SEQ ID NO:93 to SEQ ID NO:155. or a 
variant thereof, and encompasses an epitope of a 12q23-qter polypeptide 
such that an antibody raised against the peptide forms a specific immune 
complex with A 12q23-qter amino acid sequence. 
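
As a minimal sketch of the length requirement stated above, the following
Python snippet enumerates every contiguous window of at least five residues
in a polypeptide sequence, i.e., the candidate peptide portions from which an
immunogen could be chosen. The function name, the MIN_EPITOPE_LENGTH
constant, and the toy sequence are illustrative assumptions and are not taken
from the Sequence Listing. Whether a given window actually encompasses an
epitope would still have to be determined experimentally.

```python
MIN_EPITOPE_LENGTH = 5  # "at least 5 contiguous amino acid residues"


def contiguous_peptides(sequence: str, length: int = MIN_EPITOPE_LENGTH) -> list[str]:
    """Return every contiguous peptide of the given length, in sequence order."""
    if length < MIN_EPITOPE_LENGTH:
        raise ValueError("peptide must span at least 5 contiguous residues")
    # A sequence shorter than `length` yields an empty list.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Toy sequence for illustration only (hypothetical, not a disclosed SEQ ID NO).
toy_polypeptide = "MKTAYIAK"
candidates = contiguous_peptides(toy_polypeptide)
```

Longer windows can be requested through the length argument; a full-length
polypeptide corresponds to a single window spanning the whole sequence.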

An appropriate immunogenic preparation can contain, for example,
recombinantly produced 12q23-qter polypeptide or a chemically synthesized
12q23-qter polypeptide, or portions thereof. The preparation can further
include an adjuvant, such as Freund's complete or incomplete adjuvant, or
similar immunostimulatory agent. A number of adjuvants are known and
used by those skilled in the art. Non-limiting examples of suitable adjuvants
include incomplete Freund's adjuvant, mineral gels such as alum, aluminum
phosphate, aluminum hydroxide, aluminum silica, and surface-active
substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanin, and dinitrophenol. Further
examples of adjuvants include N-acetyl-muramyl-L-threonyl-D-isoglutamine
(thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637,
referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE), and RIBI, which contains three
components extracted from bacteria, monophosphoryl lipid A, trehalose
dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2%
squalene/Tween 80 emulsion. A particularly useful adjuvant comprises 5%
(wt/vol) squalene, 2.5% Pluronic L121 polymer and 0.2% polysorbate in
phosphate buffered saline (Kwak et al., 1992, New Eng. J. Med.
327:1209-1215). Preferred adjuvants include complete BCG, Detox (RIBI,
Immunochem Research Inc.), ISCOMS, and aluminum hydroxide adjuvant
(Superphos, Biosector). The effectiveness of an adjuvant may be
determined by measuring the amount of antibodies directed against the
immunogenic peptide.

Polyclonal antibodies to 12q23-qter polypeptides can be prepared as
described above by immunizing a suitable subject with a 12q23-qter
immunogen. The antibody titer in the immunized subject can be monitored
over time by standard techniques, such as with an enzyme linked
immunosorbent assay (ELISA) using immobilized 12q23-qter polypeptide or
peptide. If desired, the antibody molecules can be isolated from the
mammal (e.g., from the blood) and further purified by well-known
techniques, such as protein A chromatography to obtain the IgG fraction.

At an appropriate time after immunization, e.g., when the antibody
titers are highest, antibody-producing cells can be obtained from the subject
and used to prepare monoclonal antibodies by standard techniques, such as
the hybridoma technique (see Kohler and Milstein, 1975, Nature
256:495-497; Brown et al., 1981, J. Immunol. 127:539-46; Brown et al.,
1980, J. Biol. Chem. 255:4980-83; Yeh et al., 1976, PNAS 76:2927-31; and
Yeh et al., 1982, Int. J. Cancer 29:269-75), the human B cell hybridoma
technique (Kozbor et al., 1983, Immunol. Today 4:72), the EBV-hybridoma
technique (Cole et al., 1985, Monoclonal Antibodies and Cancer Therapy,
Alan R. Liss, Inc., pp. 77-96) or trioma techniques.

The technology for producing hybridomas is well-known (see
generally R. H. Kenneth, 1980, Monoclonal Antibodies: A New Dimension In
Biological Analyses, Plenum Publishing Corp., New York, NY; E.A. Lerner,
1981, Yale J. Biol. Med., 54:387-402; M.L. Gefter et al., 1977, Somatic Cell
Genet. 3:231-36). In general, an immortal cell line (typically a myeloma) is
fused to lymphocytes (typically splenocytes) from a mammal immunized with
a 12q23-qter immunogen as described above, and the culture supernatants
of the resulting hybridoma cells are screened to identify a hybridoma
producing a monoclonal antibody that binds 12q23-qter polypeptides or
peptides.

Any of the many well known protocols used for fusing lymphocytes
and immortalized cell lines can be applied for the purpose of generating a
monoclonal antibody to a 12q23-qter polypeptide (see, e.g., G. Galfre et
al., 1977, Nature 266:550-52; Gefter et al., 1977; Lerner, 1981; Kenneth,
1980). Moreover, the ordinarily skilled worker will appreciate that there are
many variations of such methods. Typically, the immortal cell line (e.g., a
myeloma cell line) is derived from the same mammalian species as the
lymphocytes. For example, murine hybridomas can be made by fusing
lymphocytes from a mouse immunized with an immunogenic preparation of
the present invention with an immortalized mouse cell line. Preferred
immortal cell lines are mouse myeloma cell lines that are sensitive to culture
medium containing hypoxanthine, aminopterin, and thymidine (HAT
medium). Any of a number of myeloma cell lines can be used as a fusion
partner according to standard techniques, e.g., the P3-NS1/1-Ag4-1,
P3-x63-Ag8.653, or Sp2/0-Ag14 myeloma lines. These myeloma lines are
available from ATCC (American Type Culture Collection, Manassas, VA).
Typically, HAT-sensitive mouse myeloma cells are fused to mouse
splenocytes using polyethylene glycol (PEG). Hybridoma cells resulting
from the fusion are then selected using HAT medium, which kills unfused
and unproductively fused myeloma cells (unfused splenocytes die after
several days because they are not transformed). Hybridoma cells producing
a monoclonal antibody of the invention are detected by screening the
hybridoma culture supernatants for antibodies that bind 12q23-qter
polypeptides or peptides, e.g., using a standard ELISA assay.
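
The HAT selection logic described above can be summarized in a short Python
sketch. The model is a simplification introduced here for illustration only:
each cell is reduced to two hypothetical flags, hat_resistant and immortal,
and a productive fusion is assumed to carry HAT resistance from the
splenocyte and immortality from the myeloma partner. The class, function
names, and two-flag model are assumptions, not part of the disclosed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Toy model of a cell in a fusion culture (illustrative assumption)."""
    name: str
    hat_resistant: bool  # able to grow in HAT medium
    immortal: bool       # able to divide indefinitely in culture


# Fusion partners as characterized in the text.
MYELOMA = Cell("myeloma", hat_resistant=False, immortal=True)
SPLENOCYTE = Cell("splenocyte", hat_resistant=True, immortal=False)


def fuse(a: Cell, b: Cell) -> Cell:
    """Assume a productive fusion keeps any trait carried by either parent."""
    return Cell(
        name=f"{a.name} x {b.name}",
        hat_resistant=a.hat_resistant or b.hat_resistant,
        immortal=a.immortal or b.immortal,
    )


def survives_hat_selection(cell: Cell) -> bool:
    """A cell persists in HAT medium only if it is both resistant and immortal."""
    return cell.hat_resistant and cell.immortal


hybridoma = fuse(MYELOMA, SPLENOCYTE)
survivors = [c.name for c in (MYELOMA, SPLENOCYTE, hybridoma)
             if survives_hat_selection(c)]
```

Under these assumptions only the myeloma-splenocyte fusion survives, which is
why the surviving cultures can then be screened directly for antibody
production.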

As an alternative to preparing monoclonal antibody-secreting hybridomas, a
monoclonal antibody can be identified and isolated by screening a
recombinant combinatorial immunoglobulin library (e.g., an antibody phage
display library) with the corresponding 12q23-qter polypeptide to thereby
isolate immunoglobulin library members that bind the polypeptide. Kits for
generating and screening phage display libraries are commercially available
(e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No.
27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit, Catalog No.
240612).

Additionally, examples of methods and reagents particularly
amenable for use in generating and screening antibody display libraries can
be found in, for example, Ladner et al. U.S. Pat. No. 5,223,409; Kang et al.
PCT International Publication No. WO 92/18619; Dower et al. PCT
International Publication No. WO 91/17271; Winter et al. PCT International
Publication WO 92/20791; Markland et al. PCT International Publication No.
WO 92/15679; Breitling et al. PCT International Publication WO 93/01288;
McCafferty et al. PCT International Publication No. WO 92/01047; Garrard
et al. PCT International Publication No. WO 92/09690; Ladner et al. PCT
International Publication No. WO 90/02809; Fuchs et al., 1991,
Bio/Technology 9:1370-1372; Hay et al., 1992, Hum. Antibod. Hybridomas
3:81-85; Huse et al., 1989, Science 246:1275-1281; Griffiths et al., 1993,
EMBO J. 12:725-734; Hawkins et al., 1992, J. Mol. Biol. 226:889-896;
Clarkson et al., 1991, Nature 352:624-628; Gram et al., 1992, PNAS
89:3576-3580; Garrard et al., 1991, Bio/Technology 9:1373-1377;
Hoogenboom et al., 1991, Nuc. Acid Res. 19:4133-4137; Barbas et al.,
1991, PNAS 88:7978-7982; and McCafferty et al., 1990, Nature 348:552-55.

Additionally, recombinant antibodies to a 12q23-qter polypeptide,
such as chimeric and humanized monoclonal antibodies, comprising both
human and non-human portions, can be made using standard recombinant
DNA techniques. Such chimeric and humanized monoclonal antibodies can
be produced by recombinant DNA techniques known in the art, for example
using methods described in Robinson et al. International Application No.
PCT/US86/02269; Akira, et al. European Patent Application 184,187;
Taniguchi, M., European Patent Application 171,496; Morrison et al.
European Patent Application 173,494; Neuberger et al. PCT International
Publication No. WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly
et al. European Patent Application 125,023; Better et al., 1988, Science
240:1041-1043; Liu et al., 1987, PNAS 84:3439-3443; Liu et al., 1987, J.
Immunol. 139:3521-3526; Sun et al., 1987, PNAS 84:214-218; Nishimura et
al., 1987, Canc. Res. 47:999-1005; Wood et al., 1985, Nature 314:446-449;
Shaw et al., 1988, J. Natl. Cancer Inst. 80:1553-1559; S.L. Morrison,
1985, Science 229:1202-1207; Oi et al., 1986, BioTechniques 4:214; Winter
U.S. Pat. No. 5,225,539; Jones et al., 1986, Nature 321:522-525; Verhoeyen
et al., 1988, Science 239:1534; and Beidler et al., 1988, J. Immunol.
141:4053-4060.

An antibody against a 12q23-qter polypeptide (e.g., monoclonal
antibody) can be used to isolate the corresponding polypeptide by standard
techniques, such as affinity chromatography or immunoprecipitation. For
example, antibodies can facilitate the purification of a natural 12q23-qter
polypeptide from cells and of a recombinantly produced 12q23-qter
polypeptide or peptide expressed in host cells. In addition, an antibody that
binds to a 12q23-qter polypeptide can be used to detect the corresponding
protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the
abundance and pattern of expression of the protein. Such antibodies can
also be used diagnostically to monitor 12q23-qter protein levels in tissue as
part of a clinical testing procedure, e.g., to determine the
efficacy of a given treatment regimen as described in detail herein. In
addition, antibodies to a 12q23-qter polypeptide can be used as therapeutics
for the treatment of diseases related to abnormal 12q23-qter gene
expression or function, e.g., asthma.

LIGANDS

The 12q23-qter polypeptides (e.g., SEQ ID NO:93 to SEQ ID
NO:155), polynucleotides (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID
NO:156 to SEQ ID NO:4687), variants, or fragments or portions thereof, can
be used to screen for ligands (e.g., agonists, antagonists, or inhibitors) that
modulate the levels or activity of the 12q23-qter polypeptide. In addition,
these 12q23-qter molecules can be used to identify endogenous ligands that
bind to 12q23-qter polypeptides or polynucleotides in the cell. In one aspect
of the present invention, the full-length 12q23-qter polypeptide (e.g., SEQ ID
NO:93 to SEQ ID NO:155) is used to identify ligands. Alternatively, variants
or portions of a 12q23-qter polypeptide are used. Such portions may
comprise, for example, one or more domains of the 12q23-qter polypeptide
(e.g., transmembrane, intracellular, extracellular, SH3, fibronectin III repeat,
cysteine-rich, and Ser/Thr-XXX-Val domains) disclosed herein. Of particular
interest are screening assays that identify agents that have relatively low
levels of toxicity in human cells. A wide variety of assays may be used for
this purpose, including in vitro protein-protein binding assays,
electrophoretic mobility shift assays, immunoassays, and the like.

Ligands that bind to the 12q23-qter polypeptides or polynucleotides of
the invention are potentially useful in diagnostic applications and/or
pharmaceutical compositions, as described in detail herein. Ligands may
encompass numerous chemical classes, though typically they are organic
molecules, e.g., small molecules. Preferably, small molecules have a
molecular weight of less than 5000 daltons; more preferably, small
molecules have a molecular weight of more than 50 and less than 2,500
daltons. Such molecules can comprise functional groups necessary for
structural interaction with proteins, particularly hydrogen bonding, and
typically include at least an amine, carbonyl, hydroxyl or carboxyl group,
preferably at least two of the functional chemical groups. Useful molecules
often comprise cyclical carbon or heterocyclic structures and/or aromatic or
polyaromatic structures substituted with one or more of the above functional
groups. Such molecules can also comprise biomolecules including
peptides, saccharides, fatty acids, steroids, purines, pyridines, derivatives,
structural analogs, or combinations thereof.
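
The molecular weight preferences stated above can be expressed as a simple
filter. The following Python sketch is illustrative only: the function name,
the tier labels, and the example weights are assumptions, and the thresholds
are read directly from the text (less than 5000 daltons preferred; more than
50 and less than 2,500 daltons more preferred).

```python
def weight_preference(mol_weight_da: float) -> str:
    """Classify a candidate small molecule by molecular weight in daltons."""
    if 50 < mol_weight_da < 2500:
        return "more preferred"
    if mol_weight_da < 5000:
        return "preferred"
    return "outside preferred range"


# Hypothetical molecular weights, for illustration only.
example_weights = [300.0, 2500.0, 4000.0, 5000.0]
tiers = [weight_preference(w) for w in example_weights]
```

Both bounds are treated as strict inequalities, so a compound of exactly
2,500 daltons falls in the broader "preferred" tier rather than the narrower
one.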

Ligands may include, for example, 1) peptides such as soluble
peptides, including Ig-tailed fusion peptides and members of random peptide
libraries (see, e.g., Lam et al., 1991, Nature 354:82-84; Houghten et al.,
1991, Nature 354:84-86) and combinatorial chemistry-derived molecular
libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides
(e.g., members of random and partially degenerate, directed
phosphopeptide libraries, see, e.g., Songyang et al., 1993, Cell 72:767-778);
3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic,
chimeric, and single chain antibodies as well as Fab, F(ab')2, Fab
expression library fragments, and epitope-binding fragments of antibodies);
and 4) small organic and inorganic molecules.

Test agents useful for identifying 12q23-qter ligands can be obtained
from a wide variety of sources including libraries of synthetic or natural
compounds. Synthetic compound libraries are commercially available from,
for example, Maybridge Chemical Co. (Trevillet, Cornwall, UK), Comgenex
(Princeton, NJ), Brandon Associates (Merrimack, NH), and Microsource
(New Milford, CT). A rare chemical library is available from Aldrich Chemical
Company, Inc. (Milwaukee, WI). Natural compound libraries comprising
bacterial, fungal, plant or animal extracts are available from, for example,
Pan Laboratories (Bothell, WA). In addition, numerous means are available
for random and directed synthesis of a wide variety of organic compounds
and biomolecules, including expression of randomized oligonucleotides.

Alternatively, libraries of natural compounds in the form of bacterial,
fungal, plant and animal extracts can be readily produced. Methods for the
synthesis of molecular libraries are readily available (see, e.g., DeWitt et al.,
1993, Proc. Natl. Acad. Sci. USA 90:6909; Erb et al., 1994, Proc. Natl. Acad.
Sci. USA 91:11422; Zuckermann et al., 1994, J. Med. Chem. 37:2678; Cho
et al., 1993, Science 261:1303; Carell et al., 1994, Angew. Chem. Int. Ed.
Engl. 33:2059; Carell et al., 1994, Angew. Chem. Int. Ed. Engl. 33:2061; and
Gallop et al., 1994, J. Med. Chem. 37:1233). In addition, natural or
synthetic compound libraries and compounds can be readily modified
through conventional chemical, physical and biochemical means (see, e.g.,
Blondelle et al., 1996, Trends in Biotech. 14:60), and may be used to
produce combinatorial libraries. In another approach, previously identified
pharmacological agents can be subjected to directed or random chemical
modifications, such as acylation, alkylation, esterification, amidification, and
the analogs can be screened for 12q23-qter gene-modulating activity.

Numerous methods for producing combinatorial libraries are known in
the art, including those involving biological libraries; spatially addressable
parallel solid phase or solution phase libraries; synthetic library methods
requiring deconvolution; the 'one-bead one-compound' library method; and
synthetic library methods using affinity chromatography selection. The
biological library approach is limited to polypeptide libraries, while the other
four approaches are applicable to polypeptide, non-peptide oligomer, or
small molecule libraries of compounds (K. S. Lam, 1997, Anticancer Drug
Des. 12:145).

Non-limiting examples of small molecules, small molecule libraries,
combinatorial libraries, and screening methods are described in B.
Seligmann, 1995, "Synthesis, Screening, Identification of Positive
Compounds and Optimization of Leads from Combinatorial Libraries:
Validation of Success" p. 69-70, Symposium: Exploiting Molecular Diversity:
Small Molecule Libraries for Drug Discovery, La Jolla, CA, Jan. 23-25, 1995
(conference summary available from Wendy Warr & Associates, 6 Berwick
Court, Cheshire, UK CW4 7HZ); E. Martin et al., 1995, J. Med. Chem.
38:1431-1436; E. Martin et al., 1995, "Measuring diversity: Experimental
design of combinatorial libraries for drug discovery" Abstract, ACS Meeting,
Anaheim, CA, COMP 32; and E. Martin, 1995, "Measuring Chemical
Diversity: Random Screening or Rationale Library Design" p. 27-30,
Symposium: Exploiting Molecular Diversity: Small Molecule Libraries for
Drug Discovery, La Jolla, Calif., Jan. 23-25, 1995 (conference summary
available from Wendy Warr & Associates, 6 Berwick Court, Cheshire, UK
CW4 7HZ).

Libraries may be screened in solution (e.g., Houghten, 1992,
BioTechniques 13:412-421), or on beads (Lam, 1991, Nature 354:82-84),
chips (Fodor, 1993, Nature 364:555-556), bacteria or spores (Ladner U.S.
Pat. No. 5,223,409), plasmids (Cull et al., 1992, Proc. Natl. Acad. Sci. USA
89:1865-1869), or on phage (Scott and Smith, 1990, Science 249:386-390;
Devlin, 1990, Science 249:404-406; Cwirla et al., 1990, Proc. Natl. Acad.
Sci. USA 97:6378-6382; Felici, 1991, J. Mol. Biol. 222:301-310; Ladner,
supra).

Where the screening assay is a binding assay, a 12q23-qter
polypeptide, polynucleotide, analog, or fragment thereof, may be joined to a
label, where the label can directly or indirectly provide a detectable signal.
Various labels include radioisotopes, fluorescers, chemiluminescers,
enzymes, specific binding molecules, particles, e.g., magnetic particles, and
the like. Specific binding molecules include pairs, such as biotin and
streptavidin, digoxin and antidigoxin, etc. For the specific binding members,
the complementary member would normally be labeled with a molecule that
provides for detection, in accordance with known procedures.

A variety of other reagents may be included in the screening assay.
These include reagents like salts, neutral proteins, e.g., albumin, detergents,
etc., that are used to facilitate optimal protein-protein binding and/or reduce
non-specific or background interactions. Reagents that improve the
efficiency of the assay, such as protease inhibitors, nuclease inhibitors,
anti-microbial agents, etc., may be used. The components are added in any
order that produces the requisite binding. Incubations are performed at any
temperature that facilitates optimal activity, typically between 4° and 40°C.
Incubation periods are selected for optimum activity, but may also be
optimized to facilitate rapid high-throughput screening. Normally, between
0.1 and 1 hr will be sufficient. In general, a plurality of assay mixtures is run
in parallel with different agent concentrations to obtain a differential
response to these concentrations. Typically, one of these concentrations
serves as a negative control, i.e., at zero concentration or below the level of
detection.
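
One way to lay out such a set of parallel assay mixtures is a serial dilution
that ends in a zero-concentration negative control. The Python sketch below is
a hypothetical illustration: the function name, the choice of a fixed
dilution factor, and the example values are assumptions, since the text does
not prescribe how the concentrations are spaced.

```python
def concentration_series(top: float, dilution_factor: float, n_points: int) -> list[float]:
    """Return n_points agent concentrations, highest first, ending at zero.

    The final entry (0.0) is the negative control; the others are successive
    dilutions of the top concentration by the given factor.
    """
    if n_points < 2:
        raise ValueError("need at least one test concentration plus the control")
    if top <= 0 or dilution_factor <= 1:
        raise ValueError("top must be positive and dilution_factor greater than 1")
    series = [top / dilution_factor ** i for i in range(n_points - 1)]
    series.append(0.0)  # negative control at zero concentration
    return series


# Example: four mixtures, ten-fold dilutions from an arbitrary top of 100 units.
plate_row = concentration_series(100.0, 10.0, 4)
```

Plotting the measured signal against these concentrations gives the
differential response referred to above, with the zero-concentration well
setting the baseline.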

To perform cell-free ligand screening assays, it may be desirable to
immobilize either a 12q23-qter polypeptide, polynucleotide, or fragment to a
surface to facilitate identification of ligands that bind to these molecules, as
well as to accommodate automation of the assay. For example, a fusion
protein comprising a 12q23-qter polypeptide and an affinity tag can be
produced. In one embodiment, a
glutathione-S-transferase/phosphodiesterase fusion protein comprising a
12q23-qter polypeptide is adsorbed onto glutathione sepharose beads (Sigma
Chemical, St. Louis, MO) or glutathione-derivatized microtiter plates. Cell
lysates (e.g., containing 35S-labeled polypeptides) are added to the coated
beads under conditions to allow complex formation (e.g., at physiological
conditions for salt and pH). Following incubation, the coated beads are
washed to remove any unbound polypeptides, and the amount of
immobilized radiolabel is determined. Alternatively, the complex is
dissociated and the radiolabel present in the supernatant is determined. In
another approach, the beads are analyzed by SDS-PAGE to identify the
bound polypeptides.

Ligand-binding assays can be used to identify agonists or antagonists
that alter the function or levels of a 12q23-qter polypeptide. Such assays
are designed to detect the interaction of test agents (e.g., small molecules)
with 12q23-qter polypeptides, polynucleotides, analogs, or fragments or
portions thereof. Interactions may be detected by direct measurement of
binding. Alternatively, interactions may be detected by indirect indicators of
binding, such as stabilization/destabilization of protein structure, or
activation/inhibition of biological function. Non-limiting examples of useful
ligand-binding assays are detailed below.

Ligands that bind to 12q23-qter polypeptides, polynucleotides,
analogs, or fragments or portions thereof, can be identified using real-time
Bimolecular Interaction Analysis (BIA; Sjolander et al., 1991, Anal. Chem.
63:2338-2345; Szabo et al., 1995, Curr. Opin. Struct. Biol. 5:699-705).
BIA-based technology (e.g., BIAcore™; LKB Pharmacia, Sweden) allows study
of biospecific interactions in real time, without labeling. In BIA, changes in
the optical phenomenon of surface plasmon resonance (SPR) are used to
determine real-time interactions of biological molecules.

Ligands can also be identified by scintillation proximity assays (SPA,
described in U.S. Patent No. 4,568,649). In a modification of this assay that
is currently undergoing development, chaperonins are used to distinguish
folded and unfolded proteins. A tagged protein is attached to SPA beads,
and test agents are added. The bead is then subjected to mild denaturing
conditions (e.g., heat, exposure to SDS, etc.) and a purified labeled
chaperonin is added. If a test agent binds to a target, the labeled
chaperonin will not bind; conversely, if no test agent binds, the protein will
undergo some degree of denaturation and the chaperonin will bind.

Ligands can also be identified using a binding assay based on
mitochondrial targeting signals (Hurt et al., 1985, EMBO J. 4:2061-2068;
Eilers and Schatz, 1986, Nature 322:228-231). In a mitochondrial import
assay, expression vectors are constructed in which nucleic acids encoding
particular target proteins are inserted downstream of sequences encoding
mitochondrial import signals. The chimeric proteins are synthesized and
tested for their ability to be imported into isolated mitochondria in the
absence and presence of test compounds. A test compound that binds to
the target protein should inhibit its uptake into isolated mitochondria in vitro.

The ligand-binding assay described in Fodor et al., 1991, Science
251:767-773, which involves testing the binding affinity of test compounds
for a plurality of defined polymers synthesized on a solid substrate, can also
be used.

Ligands that bind to 12q23-qter polypeptides or peptides can be
identified using two-hybrid assays (see, e.g., U.S. Pat. No. 5,283,317;
Zervos et al., 1993, Cell 72:223-232; Madura et al., 1993, J. Biol. Chem.
268:12046-12054; Bartel et al., 1993, BioTechniques 14:920-924; Iwabuchi
et al., 1993, Oncogene 8:1693-1696; and Brent WO 94/10300). The
two-hybrid system relies on the reconstitution of transcription activation
activity by association of the DNA-binding and transcription activation
domains of a transcriptional activator through protein-protein interaction.
The yeast GAL4 transcriptional activator may be used in this way, although
other transcription factors have been used and are well known in the art. To
carry out the two-hybrid assay, the GAL4 DNA-binding domain and the
GAL4 transcription activation domain are expressed, separately, as fusions
to potential interacting polypeptides.

In one embodiment, the "bait" protein comprises a 12q23-qter
polypeptide fused to the GAL4 DNA-binding domain. The "fish" protein
comprises, for example, a human cDNA library encoded polypeptide fused
to the GAL4 transcription activation domain. If the two coexpressed fusion
proteins interact in the nucleus of a host cell, a reporter gene (e.g., LacZ) is
activated to produce a detectable phenotype. The host cells that show
two-hybrid interactions can be used to isolate the plasmids containing
the cDNA library sequences. These plasmids can be analyzed to determine
the nucleic acid sequence and predicted polypeptide sequence of the
candidate ligand. Alternatively, methods such as the three-hybrid (Licitra et
al., 1996, Proc. Natl. Acad. Sci. USA 93:12817-12821) and reverse
two-hybrid (Vidal et al., 1996, Proc. Natl. Acad. Sci. USA 93:10315-10320)
systems may be used. Commercially available two-hybrid systems such as
the CLONTECH Matchmaker™ systems and protocols (CLONTECH
Laboratories, Inc., Palo Alto, CA) may also be used (see also A.R.
Mendelsohn et al., 1994, Curr. Op. Biotech. 5:482; E.M. Phizicky et al.,
1995, Microbiological Rev. 59:94; M. Yang et al., 1995, Nucleic Acids Res.
23:1152; S. Fields et al., 1994, Trends Genet. 10:286; and U.S. Patent Nos.
6,283,173 and 5,468,614).

Several methods of automated assays have been developed in
recent years so as to permit screening of tens of thousands of test agents in
a short period of time. High-throughput screening methods are particularly
preferred for use with the present invention. The ligand-binding assays
described herein can be adapted for high-throughput screens, or alternative
screens may be employed. For example, continuous format high throughput
screens (CF-HTS) using at least one porous matrix allow the researcher to
test large numbers of test agents for a wide range of biological or
biochemical activity (see United States Patent No. 5,976,813 to Beutel et
al.). Moreover, CF-HTS can be used to perform multi-step assays.

DIAGNOSTICS

As discussed herein, 12q23-qter genes are associated with various
diseases and disorders, including but not limited to, asthma, atopy, obesity,
male germ cell tumors, histidinemia, growth retardation with deafness and
mental retardation, deficiency of Acyl-CoA dehydrogenase, spinal muscular
atrophy, Darier disease, cardiomyopathy, Spinocerebellar ataxia-2,
brachydactyly, Mevalonicaciduria, Hyperimmunoglobulinemia D, Noonan
syndrome-1, Cardiofaciocutaneous syndrome, spinal muscular atrophy-4,
tyrosinemia, phenylketonuria, B-cell non-Hodgkin lymphoma,
Ulnar-mammary syndrome, Holt-Oram syndrome, Scapuloperoneal spinal
muscular atrophy, alcohol intolerance, MODY, diabetes mellitus,
non-insulin-dependent type 2, diabetes mellitus insulin-dependent (See
National Center for Biotechnology Information at
http://www.ncbi.nlm.nih.gov/omim/), and
inflammatory bowel disease (B. Wallaert et al., 1995, J. Exp. Med.
182:1897-1904). The present invention therefore provides nucleic acids and
antibodies that can be useful in diagnosing individuals with disorders
associated with aberrant 12q23-qter gene expression and/or mutated
12q23-qter genes. In particular, nucleic acids comprising 12q23-qter SNPs
can be used to identify chromosomal abnormalities linked to these diseases.
Additionally, antibodies directed against the amino acid variants encoded by
the 12q23-qter SNPs can be used to identify disease-associated
polypeptides.

Antibody-based diagnostic methods: In a further embodiment of the
present invention, antibodies which specifically bind to a 12q23-qter
polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155) may be used for the
diagnosis of conditions or diseases characterized by underexpression or
overexpression of the 12q23-qter polynucleotide or polypeptide, or in assays
to monitor patients being treated with a 12q23-qter polypeptide,
polynucleotide, or antibody, or a 12q23-qter agonist, antagonist, or inhibitor.
The antibodies useful for diagnostic purposes may be prepared in the
same manner as those for use in therapeutic methods, described herein.
Antibodies may be raised to a full-length 12q23-qter polypeptide sequence
(e.g., SEQ ID NO:93 to SEQ ID NO:155). Alternatively, the antibodies may
be raised to portions or variants of the 12q23-qter polypeptide. Such
variants include polypeptides encoded by the disclosed 12q23-qter SNPs or
alternate splice variants. In one aspect of the invention, antibodies are
prepared to bind to a 12q23-qter polypeptide fragment comprising one or
more domains of the 12q23-qter polypeptide (e.g., transmembrane,
intracellular, extracellular, SH3, fibronectin III repeat, cysteine-rich, and
Ser/Thr-XXX-Val domains), as described in detail herein.

Diagnostic assays for a 12q23-qter polypeptide include methods that
utilize the antibody and a label to detect the protein in biological samples
(e.g., human body fluids, cells, tissues, or extracts of cells or tissues). The
antibodies may be used with or without modification, and may be labeled by
joining them, either covalently or non-covalently, with a reporter molecule. A
wide variety of reporter molecules that are known in the art may be used,
several of which are described herein.

The invention provides methods for detecting disease-associated
antigenic components in a biological sample, which methods comprise the
steps of: 1) contacting a sample suspected to contain a disease-associated
antigenic component with an antibody specific for a disease-associated
antigen, extracellular or intracellular, under conditions in which an
antigen-antibody complex can form between the antibody and
disease-associated antigenic components in the sample; and 2) detecting
any antigen-antibody complex formed in step (1) using any suitable means
known in the art, wherein the detection of a complex indicates the presence
of disease-associated antigenic components in the sample. It will be
understood that assays that utilize antibodies directed against altered
12q23-qter amino acid sequences (i.e., epitopes encoded by SNPs,
modifications, mutations, or variants) are within the scope of the invention.

Many immunoassay formats are known in the art, and the particular
format used is determined by the desired application. An immunoassay can
use, for example, a monoclonal antibody directed against a single
disease-associated epitope, a combination of monoclonal antibodies directed
against different epitopes of a single disease-associated antigenic component,
monoclonal antibodies directed towards epitopes of different
disease-associated antigens, polyclonal antibodies directed towards the same
disease-associated antigen, or polyclonal antibodies directed towards
different disease-associated antigens. Protocols can also, for example, use
solid supports, or may involve immunoprecipitation.

In accordance with the present invention, "competitive" (U.S. Pat.
Nos. 3,654,090 and 3,850,752), "sandwich" (U.S. Pat. No. 4,016,043), and
"double antibody," or "DASP" assays may be used. Several procedures for
measuring the amount of a 12q23-qter polypeptide in a sample (e.g., ELISA,
RIA, and FACS) are known in the art and provide a basis for diagnosing
altered or abnormal levels of 12q23-qter polypeptide expression. Normal or
standard values for 12q23-qter polypeptide expression are established by
incubating biological samples taken from normal subjects, preferably
human, with antibody to a 12q23-qter polypeptide under conditions suitable
for complex formation. The amount of standard complex formation may be
quantified by various methods; photometric means are preferred. Levels of
the 12q23-qter polypeptide expressed in the subject sample, negative
control (normal) sample, and positive control (disease) sample are
compared with the standard values. Deviation between standard and
subject values establishes the parameters for diagnosing disease.
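
By way of non-limiting illustration, the comparison of a subject value with
standard values may be sketched in Python as follows. The function names,
the example readings, and the cutoff of two standard deviations are
assumptions made for illustration only and are not taken from the
disclosure; any suitable statistical criterion may be substituted.

```python
from statistics import mean, stdev


def deviation_from_standard(normal_values, subject_value):
    """Return how far the subject value lies from the normal mean, in SD units."""
    if len(normal_values) < 2:
        raise ValueError("need at least two normal samples to estimate spread")
    spread = stdev(normal_values)
    if spread == 0:
        raise ValueError("normal samples show no variation")
    return (subject_value - mean(normal_values)) / spread


def classify_level(normal_values, subject_value, cutoff=2.0):
    """Label a subject value relative to the standard range.

    The cutoff (in standard deviations) is an illustrative assumption.
    """
    z = deviation_from_standard(normal_values, subject_value)
    if z > cutoff:
        return "elevated"
    if z < -cutoff:
        return "reduced"
    return "within standard range"


# Hypothetical photometric readings from normal subjects (arbitrary units).
normal_readings = [1.00, 1.10, 0.90, 1.00, 1.05, 0.95]
print(classify_level(normal_readings, 1.50))
```

In such a sketch, the normal readings play the role of the standard values,
and the returned label indicates whether the subject value deviates from them.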

Typically, immunoassays use either a labeled antibody or a labeled
antigenic component (i.e., to compete with the antigen in the sample for
binding to the antibody). A number of fluorescent materials are known and
can be utilized as labels for antibodies or polypeptides. These include, for
example, Cy3, Cy5, GFP (e.g., EGFP, DsRed, dEFP, etc. (CLONTECH,
Palo Alto, CA)), Alexa, BODIPY, fluorescein (e.g., FluorX, DTAF, and FITC),
rhodamine (e.g., TRITC), auramine, Texas Red, AMCA blue, and Lucifer
Yellow. Antibodies or polypeptides can also be labeled with a radioactive
element or with an enzyme. Preferred isotopes include 3H, 14C, 32P, 35S,
36Cl, 51Cr, 57Co, 58Co, 59Fe, 90Y, 125I, 131I, and 186Re.

Preferred enzymes include peroxidase, β-glucuronidase,
β-D-glucosidase, β-D-galactosidase, urease, glucose oxidase plus peroxidase,
and alkaline phosphatase (see, e.g., U.S. Pat. Nos. 3,654,090; 3,850,752;
and 4,016,043). Enzymes can be conjugated by reaction with bridging
molecules such as carbodiimides, diisocyanates, glutaraldehyde, and the
like. Enzyme labels can be detected visually, or measured by colorimetric,
spectrophotometric, fluorospectrophotometric, amperometric, or gasometric
techniques. Other labeling systems, such as avidin/biotin and Tyramide Signal
Amplification (TSA™), are known in the art and are commercially available
(see, e.g., ABC kit, Vector Laboratories, Inc., Burlingame, CA; NEN® Life
Science Products, Inc., Boston, MA).

Kits suitable for antibody-based diagnostic applications typically
include one or more of the following components:

(1) Antibodies: The antibodies may be pre-labeled; alternatively, the
antibody may be unlabeled and the ingredients for labeling may be included
in the kit in separate containers, or a secondary, labeled antibody is
provided; and

(2) Reaction components: The kit may also contain other suitably
packaged reagents and materials needed for the particular immunoassay
protocol, including solid-phase matrices, if applicable, and standards.

The kits referred to above may include instructions for conducting the
test. Furthermore, in preferred embodiments, the diagnostic kits are
adaptable to high-throughput and/or automated operation.

Nucleic acid-based diagnostic methods: The invention provides
methods for detecting altered levels or sequences of 12q23-qter nucleic
acids (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID
NO:4687) in a sample, such as in a biological sample, comprising the steps
of: 1) contacting a sample suspected to contain a disease-associated
nucleic acid with one or more disease-associated nucleic acid probes under
conditions in which hybrids can form between any of the probes and
disease-associated nucleic acid in the sample; and 2) detecting any hybrids
formed in step (1) using any suitable means known in the art, wherein the
detection of hybrids indicates the presence of the disease-associated
nucleic acid in the sample. Exemplary methods are described in Examples
9 and 10, herein below. To detect disease-associated nucleic acids present
in low levels in biological samples, it may be necessary to amplify the
disease-associated sequences or the hybridization signal as part of the
diagnostic assay. Techniques for amplification are known to those of skill in
the art.

The presence of 12q23-qter polynucleotide sequences can be
detected by DNA-DNA or DNA-RNA hybridization, or by amplification using
probes or primers comprising at least a portion of a 12q23-qter
polynucleotide, or a sequence complementary thereto. In particular, nucleic
acid amplification-based assays can use 12q23-qter oligonucleotides or
oligomers to detect transformants containing 12q23-qter DNA or RNA.
Preferably, 12q23-qter nucleic acids useful as probes in diagnostic methods
include oligonucleotides at least 15 contiguous nucleotides in length, more
preferably at least 20 contiguous nucleotides in length, and most preferably
at least 25-55 contiguous nucleotides in length, that hybridize specifically
with 12q23-qter nucleic acids. As non-limiting examples, probes or primers
useful for diagnostics may comprise any of the 12q23-qter DNA nucleotide
sequences shown in Tables 8, 9, 11A, and 11B.
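
As a minimal sketch of how candidate probes of a stated minimum length might
be enumerated from a target sequence, the following Python fragment may be
considered. The helper names and the toy sequence are hypothetical and are
not drawn from the Sequence Listing; hybridization specificity would still
have to be established experimentally or by comparison with other sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_probes(target, length=15):
    """List every window of `length` contiguous nucleotides in the target.

    Each entry is (start_index, probe_sequence); an empty list is returned
    when the target is shorter than the requested probe length.
    """
    target = target.upper()
    return [
        (start, target[start:start + length])
        for start in range(len(target) - length + 1)
    ]


# Toy 20-nucleotide target used only for illustration.
toy_target = "ACGTACGTACGTACGTACGT"
for start, probe in candidate_probes(toy_target, length=15):
    # The reverse complement is the strand that would hybridize to the probe region.
    print(start, probe, reverse_complement(probe))
```

The default length of 15 mirrors the shortest preferred probe length recited
above; lengths of 20 or 25-55 can be passed in the same way.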

Several methods can be used to produce specific probes for
12q23-qter polynucleotides. For example, labeled probes can be produced by
oligo-labeling, nick translation, end-labeling, or PCR amplification using a
labeled nucleotide. Alternatively, 12q23-qter polynucleotide sequences
(e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID
NO:4687), or any portions or fragments thereof, may be cloned into a vector
for the production of an mRNA probe. Such vectors are known in the art,
are commercially available, and may be used to synthesize RNA probes in
vitro by addition of an appropriate RNA polymerase, such as T7, T3, or
SP6, and labeled nucleotides. These procedures may be conducted using
a variety of commercially available kits (e.g., from Amersham-Pharmacia;
Promega Corp.; and U.S. Biochemical Corp., Cleveland, OH). Suitable
reporter molecules or labels which may be used include radionucleotides,
enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as
substrates, cofactors, inhibitors, magnetic particles, and the like.

A sample to be analyzed, such as, for example, a tissue sample (e.g.,
hair or buccal cavity) or body fluid sample (e.g., blood or saliva), may be
contacted directly with the nucleic acid probes. Alternatively, the sample
may be treated to extract the nucleic acids contained therein. It will be
understood that the particular method used to extract DNA will depend on
the nature of the biological sample. The resulting nucleic acid from the
sample may be subjected to gel electrophoresis or other size separation
techniques, or the nucleic acid sample may be immobilized on an
appropriate solid matrix without size separation.

Kits suitable for nucleic acid-based diagnostic applications typically 
include the following components: 

(1) Probe DNA: The probe DNA may be prelabeled; alternatively, the
probe DNA may be unlabeled and the ingredients for labeling may be
included in the kit in separate containers; and

(2) Hybridization reagents: The kit may also contain other suitably 
packaged reagents and materials needed for the particular hybridization 
protocol, including solid-phase matrices, if applicable, and standards. 

In cases where a disease condition is suspected to involve an
alteration of a 12q23-qter nucleotide sequence, specific oligonucleotides
may be constructed and used to assess the level of disease mRNA in cells
affected or other tissue affected by the disease. For example, PCR can be
used to test whether a person has a disease-related polymorphism (i.e.,
mutation). Specific methods of polymorphism identification are described
herein, but are not intended to limit the present invention. The detection of
polymorphisms in DNA sequences can be accomplished by a variety of
methods including, but not limited to, RFLP detection based on
allele-specific restriction-endonuclease cleavage (Kan and Dozy, 1978, Lancet
ii:910-912), hybridization with allele-specific oligonucleotide probes (Wallace
et al., 1978, Nucl. Acids Res. 6:3543-3557), including immobilized
oligonucleotides (Saiki et al., 1989, Proc. Natl. Acad. Sci. USA
86:6230-6234) or oligonucleotide arrays (Maskos and Southern, 1993, Nucl.
Acids Res. 21:2269-2270), allele-specific PCR (Newton et al., 1989, Nucl.
Acids Res. 17:2503-2516), mismatch-repair detection (MRD) (Faham and Cox,
1995, Genome Res. 5:474-482), binding of MutS protein (Wagner et al.,
1995, Nucl. Acids Res. 23:3944-3948), denaturing-gradient gel
electrophoresis (DGGE) (Fisher and Lerman, 1983, Proc. Natl. Acad.
Sci. USA 80:1579-1583), single-strand-conformation-polymorphism
detection (Orita et al., 1989, Genomics 5:874-879), RNAase cleavage at
mismatched base-pairs (Myers et al., 1985, Science 230:1242), chemical
(Cotton et al., 1988, Proc. Natl. Acad. Sci. USA 85:4397-4401) or enzymatic
(Youil et al., 1995, Proc. Natl. Acad. Sci. USA 92:87-91) cleavage of
heteroduplex DNA, methods based on allele-specific primer extension
(Syvanen et al., 1990, Genomics 8:684-692), genetic bit analysis (GBA)
(Nikiforov et al., 1994, Nucl. Acids Res. 22:4167-4175), the
oligonucleotide-ligation assay (OLA) (Landegren et al., 1988, Science
241:1077), the allele-specific ligation chain reaction (LCR) (Barany, 1991,
Proc. Natl. Acad. Sci. USA 88:189-193), gap-LCR (Abravaya et al., 1995,
Nucl. Acids Res. 23:675-682), radioactive and/or fluorescent DNA sequencing
using standard procedures well known in the art, and peptide nucleic acid
(PNA) assays (Orum et al., 1993, Nucl. Acids Res. 21:5332-5336).

For PCR analysis, 12q23-qter oligonucleotides may be chemically
synthesized, generated enzymatically, or produced from a recombinant
source. Oligomers will preferably comprise two nucleotide sequences, one
with a sense orientation (5' → 3') and another with an antisense orientation
(3' ← 5'), employed under optimized conditions for identification of a specific
gene or condition. The same two oligomers, nested sets of oligomers, or
even a degenerate pool of oligomers may be employed under less stringent
conditions for detection and/or quantification of closely related DNA or RNA
sequences.

In accordance with PCR analysis, two oligonucleotides are
synthesized by standard methods or are obtained from a commercial
supplier of custom-made oligonucleotides. The length and base
composition are determined by standard criteria using the Oligo 4.0 primer
picking program (W. Rychlik, 1992; available from Molecular Biology
Insights, Inc., Cascade, CO). One of the oligonucleotides is designed so
that it will hybridize only to the disease gene DNA under the PCR conditions
used. The other oligonucleotide is designed to hybridize to a segment of
genomic DNA such that amplification of DNA using these oligonucleotide
primers produces a conveniently identified DNA fragment. Samples may be
obtained from hair follicles, whole blood, or the buccal cavity. The DNA
fragment generated by this procedure is sequenced by standard techniques.

In one particular aspect, 12q23-qter oligonucleotides can be used to
perform Genetic Bit Analysis (GBA) of 12q23-qter genes in accordance with
published methods (T.T. Nikiforov et al., 1994, Nucleic Acids Res.
22(20):4167-75; T.T. Nikiforov et al., 1994, PCR Methods Appl.
3(5):285-91). In PCR-based GBA, specific fragments of genomic DNA containing
the polymorphic site(s) are first amplified by PCR using one unmodified and
one phosphorothioate-modified primer. The double-stranded PCR product is
rendered single-stranded and then hybridized to immobilized oligonucleotide
primer in wells of a multi-well plate. The primer is designed to anneal
immediately adjacent to the polymorphic site of interest. The 3' end of the
primer is extended using a mixture of individually labeled dideoxynucleoside
triphosphates. The label on the extended base is then determined.
Preferably, GBA is performed using semi-automated ELISA or biochip
formats (see, e.g., S.R. Head et al., 1997, Nucleic Acids Res.
25(24):5065-71; T.T. Nikiforov et al., 1994, Nucleic Acids Res.
22(20):4167-75).
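
The primer placement and base call underlying GBA can be sketched, under
simplifying assumptions, as follows. The function names, the toy sequence,
the signal values, and the threshold are hypothetical; the sketch treats the
input as the strand having the same sense as the immobilized primer, so that
the incorporated dideoxynucleotide equals the base at the polymorphic site on
that strand.

```python
def gba_primer_and_base(strand, snp_index, primer_len=20):
    """Return (primer, incorporated_base) for a single-base extension.

    `strand` is written 5' to 3' and has the same sense as the primer, so the
    primer is the stretch ending immediately 5' of the polymorphic site and
    the incorporated base is the base at that site.
    """
    if snp_index < primer_len or snp_index >= len(strand):
        raise ValueError("polymorphic site must leave room for the primer")
    primer = strand[snp_index - primer_len:snp_index]
    return primer, strand[snp_index]


def call_genotype(signals, threshold=0.5):
    """Call the allele(s) whose label signal meets an illustrative threshold."""
    return "".join(sorted(base for base, s in signals.items() if s >= threshold))


# Toy example: polymorphic site at index 5, five-nucleotide primer.
primer, base = gba_primer_and_base("AAACCGTTT", snp_index=5, primer_len=5)
print(primer, base)

# Hypothetical per-label signals from one well of a heterozygous sample.
print(call_genotype({"A": 0.05, "C": 0.90, "G": 0.04, "T": 0.85}))
```

Because only one labeled base is added per primer, a heterozygous sample is
expected to yield signal from two labels, which the second function reports.
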

Other amplification techniques besides PCR may be used as
alternatives, such as ligation-mediated PCR or techniques involving Q-beta
replicase (Cahill et al., 1991, Clin. Chem. 37(9):1482-5). Products of
amplification can be detected by agarose gel electrophoresis, quantitative
hybridization, or equivalent techniques for nucleic acid detection known to
one skilled in the art of molecular biology (Sambrook et al., 1989). Other
alterations in the disease gene may be diagnosed by the same type of
amplification-detection procedures, by using oligonucleotides designed to
contain and specifically identify those alterations.

In accordance with the present invention, 12q23-qter polynucleotides
may also be used to detect and quantify levels of 12q23-qter mRNA in
biological samples in which altered expression of 12q23-qter polynucleotide
may be correlated with disease. These diagnostic assays may be used to
distinguish between the absence, presence, increase, and decrease of
12q23-qter mRNA levels, and to monitor regulation of 12q23-qter
polynucleotide levels during therapeutic treatment or intervention. For
example, 12q23-qter polynucleotide sequences, or fragments, or
complementary sequences thereof, can be used in Southern or Northern
analysis, dot blot, or other membrane-based technologies; in PCR
technologies; or in dip stick, pin, ELISA, or biochip assays utilizing fluids or
tissues from patient biopsies to detect the status of, e.g., levels or
overexpression of 12q23-qter genes, or to detect altered 12q23-qter gene
expression. Such qualitative or quantitative methods are well known in the
art (G.H. Keller and M.M. Manak, 1993, DNA Probes, 2nd Ed., Macmillan
Publishers Ltd., England; D.W. Dieffenbach and G.S. Dveksler, 1995, PCR
Primer: A Laboratory Manual, Cold Spring Harbor Press, Plainview, NY;
B.D. Hames and S.J. Higgins, 1985, Gene Probes 1, 2, IRL Press at Oxford
University Press, Oxford, England).

Methods suitable for quantifying the expression of 12q23-qter genes
include radiolabeling or biotinylating nucleotides, co-amplification of a
control nucleic acid, and standard curves onto which the experimental
results are interpolated (P.C. Melby et al., 1993, J. Immunol. Methods
159:235-244; and C. Duplaa et al., 1993, Anal. Biochem. 212(1):229-36).
The speed of quantifying multiple samples may be accelerated by running
the assay in an ELISA format where the oligomer of interest is presented in
various dilutions and a spectrophotometric or colorimetric response gives
rapid quantification.
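
Interpolation of an experimental result onto a standard curve may be
sketched, for illustration only, as follows. The function name, the choice of
log-linear interpolation between adjacent standards, and the dilution values
are assumptions and not part of the cited methods; other curve-fitting
approaches may equally be used.

```python
import math


def interpolate_amount(standards, signal):
    """Estimate an amount from a signal using a dilution-series standard curve.

    `standards` is a list of (amount, signal) pairs with positive amounts.
    Interpolation is linear in signal and logarithmic in amount between the
    two standards that bracket the measured signal.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s1 == s0:
            continue  # skip degenerate segments with identical signals
        if s0 <= signal <= s1:
            fraction = (signal - s0) / (s1 - s0)
            log_amount = math.log10(a0) + fraction * (math.log10(a1) - math.log10(a0))
            return 10 ** log_amount
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical ten-fold dilution series: (amount in ng, colorimetric response).
curve = [(0.01, 0.10), (0.1, 0.50), (1.0, 0.90)]
print(interpolate_amount(curve, 0.70))
```

A signal outside the range spanned by the standards raises an error rather
than being extrapolated, which reflects the use of the curve for interpolation.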

In accordance with these methods, the specificity of the probe, i.e.,
whether it is made from a highly specific region (e.g., at least 8 to 10 or 12
or 15 contiguous nucleotides in the 5' regulatory region), or a less specific
region (e.g., especially in the 3' coding region), and the stringency of the
hybridization or amplification (e.g., high, moderate, or low) will determine
whether the probe identifies naturally occurring sequences encoding the
12q23-qter polypeptide, or alleles, SNPs, mutants, or related sequences.

In a particular aspect, a 12q23-qter nucleic acid sequence (e.g., SEQ
ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687), or a
sequence complementary thereto, or fragment thereof, may be useful in
assays that detect 12q23-qter-related diseases such as asthma. A
12q23-qter polynucleotide can be labeled by standard methods, and added to a
biological sample from a subject under conditions suitable for the formation
of hybridization complexes. After a suitable incubation period, the sample
can be washed and the signal is quantified and compared with a standard
value. If the amount of signal in the test sample is significantly altered from
that of a comparable negative control (normal) sample, the altered levels of
a 12q23-qter nucleotide sequence can be correlated with the presence of
the associated disease. Such assays may also be used to evaluate the
efficacy of a particular prophylactic or therapeutic regimen in animal studies,
in clinical trials, or for an individual patient.

To provide a basis for the diagnosis of a disease associated with
altered expression of a 12q23-qter gene, a normal or standard profile for
expression is established. This may be accomplished by [illegible]
biological samples taken from normal subjects, either animal or human, with

to understand the genetic basis of a disease, to diagnose disease, and to
develop and monitor the activities of therapeutic or prophylactic agents.
Preparation and use of microarrays have been described in WO 95/11995 to
Chee et al.; D.J. Lockhart et al., 1996, Nature Biotechnology 14:1675-1680;
M. Schena et al., 1996, Proc. Natl. Acad. Sci. USA 93:10614-10619; U.S.
Patent No. 6,015,702 to P. Lal et al.; J. Worley et al., 2000, Microarray
Biochip Technology, M. Schena, ed., Biotechniques Books, Natick, MA, pp.
65-86; Y.H. Rogers et al., 1999, Anal. Biochem. 266(1):23-30; S.R. Head et
al., 1999, Mol. Cell. Probes 13(2):81-7; S.J. Watson et al., 2000, Biol.
Psychiatry 48(12):1147-56.

In one application of the present invention, microarrays containing
arrays of 12q23-qter polynucleotide sequences can be used to measure the
expression levels of 12q23-qter nucleic acids in an individual. In particular,
to diagnose an individual with a 12q23-qter-related condition or disease, a
sample from a human or animal (containing nucleic acids, e.g., mRNA) can
be used as a probe on a biochip containing an array of 12q23-qter
polynucleotides (e.g., DNA) in decreasing concentrations (e.g., 1 ng, 0.1 ng,
0.01 ng, etc.). The test sample can be compared to [illegible]
and normal samples. Biochips can also be used to [illegible]
mutations or polymorphisms in a population, including but not limited to
deletions, insertions, and mismatches. For example, mutations can be
identified by: 1) placing 12q23-qter polynucleotides of this invention on a
biochip; 2) taking a test sample (containing, e.g., mRNA) and adding the
sample to the biochip; 3) determining if the test samples hybridize to the
12q23-qter polynucleotides attached to the chip under various hybridization
conditions (see, e.g., V.R. Chechetkin et al., 2000, J. Biomol. Struct. Dyn.
18(1):83-101). Alternatively, microarray sequencing can be performed (see,
e.g., E.P. Diamandis, 2000, Clin. Chem. 46(10):1523-5).

Chromosome mapping: In another application of this invention,
12q23-qter nucleic acid sequences (e.g., SEQ ID NO:1 to SEQ ID NO:92
and SEQ ID NO:156 to SEQ ID NO:4687), or complementary sequences, or
fragments thereof, can be used as probes to map genomic sequences. The
sequences may be mapped to a particular chromosome, to a specific region
of a chromosome, or to human artificial chromosome constructions (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes
(BACs), bacterial P1 constructions, or single chromosome cDNA libraries
(see, e.g., C.M. Price, 1993, Blood Rev. 7:127-134; B.J. Trask, 1991,
Trends Genet. 7:149-154).

In another of its aspects, the invention relates to a diagnostic kit for 
detecting a 12q23-qter polynucleotide or polypeptide as it relates to a 
disease or susceptibility to a disease, particularly asthma. Also related is a 
diagnostic kit that can be used to detect or assess asthma conditions. Such 
kits comprise one or more of the following: 

(a) a 12q23-qter polynucleotide, preferably the nucleotide sequence of
any one of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID
NO:4687, or a fragment thereof; or

(b) a nucleotide sequence complementary to that of (a); or

(c) a 12q23-qter polypeptide, preferably the polypeptide of any one of
SEQ ID NO:93 to SEQ ID NO:155, or a fragment thereof; or

(d) an antibody to a 12q23-qter polypeptide, preferably to the
polypeptide of any one of SEQ ID NO:93 to SEQ ID NO:155, or an antibody
bindable fragment thereof. It will be appreciated that in any such kits, (a),
(b), (c), or (d) may comprise a substantial component and that instructions
for use can be included. The kits may also contain peripheral reagents such
as buffers, stabilizers, etc.

The present invention also includes a test kit for genetic screening
that can be utilized to identify mutations in 12q23-qter genes. By identifying
patients with mutated 12q23-qter DNA and comparing the mutation to a
database that contains known mutations in 12q23-qter and a particular
condition or disease, identification and/or confirmation of a particular
condition or disease can be made. Accordingly, such a kit would comprise a
PCR-based test that would involve transcribing the patient's mRNA with a
specific primer, and amplifying the resulting cDNA using another set of
primers. The amplified product would be detectable by gel electrophoresis
and could be compared with known standards for 12q23-qter genes.
Preferably, this kit would utilize a patient's blood, serum, or saliva sample,
and the DNA would be extracted using standard techniques. Primers
flanking a known mutation would then be used to amplify a fragment of a
12q23-qter gene. The amplified piece would then be sequenced to
determine the presence of a mutation.
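
The logic of amplifying a fragment with flanking primers and comparing the
sequenced product with a known standard can be sketched in silico as
follows. This is a simplified illustration that assumes exact primer
matches and equal-length fragments; the function names and toy sequences are
hypothetical and do not correspond to any sequence of the Sequence Listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the template region bounded by the two primers, or None.

    The forward primer matches the template strand; the reverse primer
    matches the opposite strand, so its reverse complement is searched for.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def differences(reference, sample):
    """List (position, reference_base, sample_base) for each mismatch (1-based)."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares fragments of equal length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Toy reference and patient sequences differing by one substitution.
reference_dna = "AAGGCTTACGATCGGATTCCAGT"
patient_dna = "AAGGCTTACGGTCGGATTCCAGT"
ref_fragment = amplicon(reference_dna, "GGCTT", "TGGAA")
patient_fragment = amplicon(patient_dna, "GGCTT", "TGGAA")
print(differences(ref_fragment, patient_fragment))
```

Insertions and deletions would require an alignment step that this sketch
omits.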

Genomic Screening: Polymorphic genetic markers linked to a
12q23-qter gene can be used to predict susceptibility to the diseases
genetically linked to that chromosomal region. Similarly, the identification of
polymorphic genetic markers within 12q23-qter genes will allow the
identification of specific allelic variants that are in linkage disequilibrium with
other genetic lesions that affect one of the disease states discussed herein,
including respiratory disorders, obesity, and inflammatory bowel disease.
SSCP (see below) allows the identification of polymorphisms within the
genomic and coding region of the disclosed genes.

The present invention provides sequences for primers that can be
used to identify exons that contain SNPs, as well as sequences for primers
that can be used to identify the sequence changes of the SNPs. In
particular, Table 10 shows polymorphic genetic markers within the
chromosome 12q23-qter genes, which can be used to identify specific allelic
variants that are in linkage disequilibrium with other genetic lesions that
affect one of the disease states discussed herein, including respiratory
disorders, obesity, and inflammatory bowel disease. Such markers can be
used in conjunction with SSCP to identify polymorphisms within the genomic
and coding region of the disclosed gene. Table 8 shows primers that can be
used to identify exons containing SNPs. Table 9 shows primers that can be
used to identify the sequence changes of the SNPs.

This information can be used to identify additional SNPs in
accordance with the methods disclosed herein. Suitable methods for
genomic screening have also been described by, e.g., Sheffield et al., 1995,
Genet. 4:1837-1844; LeBlanc-Straceski et al., 1994, Genomics 19:341-9;
Chen et al., 1995, Genomics 25:1-8. In employing these methods, the
disclosed reagents can be used to predict the risk for disease (e.g.,
respiratory disorders, obesity, and inflammatory bowel disease) in a
population or individual.

THERAPEUTICS 

As discussed herein, 12q23-qter genes are associated with various
diseases and disorders, including but not limited to, asthma, atopy, obesity,
male germ cell tumors, histidinemia, growth retardation with deafness and
mental retardation, deficiency of Acyl-CoA dehydrogenase, spinal muscular
atrophy, Darier disease, cardiomyopathy, Spinocerebellar ataxia-2,
brachydactyly, Mevalonicaciduria, Hyperimmunoglobulinemia D, Noonan
syndrome-1, Cardiofaciocutaneous syndrome, spinal muscular atrophy-4,
tyrosinemia, phenylketonuria, B-cell non-Hodgkin lymphoma,
Ulnar-mammary syndrome, Holt-Oram syndrome, Scapuloperoneal spinal
muscular atrophy, alcohol intolerance, MODY, diabetes mellitus
non-insulin-dependent type 2, diabetes mellitus insulin-dependent (see
National Center for Biotechnology Information at
http://www.ncbi.nlm.nih.gov/omim/), and inflammatory bowel disease
(B. Wallaert et al., 1995, J. Exp. Med. 182:1897-1904). The present
invention therefore provides compositions (e.g., pharmaceutical
compositions) comprising 12q23-qter nucleic acids, polypeptides, antibodies,
ligands, or variants, portions, or fragments thereof that can be useful in
treating individuals with these disorders. Also provided are methods
employing 12q23-qter nucleic acids, polypeptides, antibodies, ligands, or
variants, portions, or fragments thereof to identify drug candidates that can
be used to prevent, treat, or ameliorate such disorders.

Drug screening and design: The present invention provides methods
of screening for drugs using a 12q23-qter polypeptide (e.g., SEQ ID NO:93
to SEQ ID NO:155), or portion thereof, in competitive binding assays,
according to methods well known in the art. For example, competitive drug
screening assays can be employed in which neutralizing antibodies capable of
specifically binding a 12q23-qter polypeptide compete with a test compound
for binding to the 12q23-qter polypeptide or fragments thereof.

The present invention further provides methods of rational drug
design employing a 12q23-qter polypeptide, antibody, or portion or
functional equivalent thereof. The goal of rational drug design is to produce
structural analogs of biologically active polypeptides of interest or of small
molecules with which they interact (e.g., agonists, antagonists, or inhibitors).
In turn, these analogs can be used to fashion drugs which are, for example,
more active or stable forms of the polypeptide, or which, e.g., enhance or
interfere with the function of the polypeptide in vivo (see, e.g., Hodgson,
1991, Bio/Technology 9:19-21). An example of rational drug design is the
development of HIV protease inhibitors (Erickson et al., 1990, Science
249:527-533).

In one approach, one first determines the three-dimensional structure
of a protein of interest or, for example, of a 12q23-qter polypeptide or ligand
complex, by x-ray crystallography, computer modeling, or a combination
thereof. Useful information regarding the structure of a polypeptide can also
be gained by computer modeling based on the structure of homologous
proteins. In addition, 12q23-qter polypeptides (e.g., SEQ ID NO:93 to SEQ
ID NO:155), or portions thereof, can be analyzed by an alanine scan (Wells,
1991, Methods in Enzymol. 202:390-411). In this technique, each amino
acid residue in a 12q23-qter polypeptide is replaced by alanine, and its
effect on the activity of the polypeptide is determined.
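
The set of single-residue substitutions used in an alanine scan can be
enumerated, by way of a minimal sketch, as follows. The function name and the
short peptide are hypothetical and chosen for illustration only; the
activity of each variant would still be measured experimentally.

```python
def alanine_scan(peptide):
    """Return (mutation_name, variant_sequence) for each non-alanine residue.

    Residues are given in one-letter code; positions in the mutation name are
    1-based, e.g. "K2A" replaces the lysine at position 2 with alanine.
    """
    variants = []
    for index, residue in enumerate(peptide):
        if residue == "A":
            continue  # substituting alanine with alanine changes nothing
        name = f"{residue}{index + 1}A"
        variants.append((name, peptide[:index] + "A" + peptide[index + 1:]))
    return variants


# Hypothetical five-residue peptide.
for name, variant in alanine_scan("MKALW"):
    print(name, variant)
```

Positions already occupied by alanine are skipped in this sketch, since
replacing them would not yield a distinct variant.
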

In another approach, an antibody specific to a 12q23-qter
polypeptide can be isolated, selected by a functional assay, and then
analyzed to solve its crystal structure. In principle, this approach can yield a
pharmacore upon which subsequent drug design can be based.
Alternatively, it is possible to bypass protein crystallography altogether by
generating anti-idiotypic antibodies (anti-ids) to a
pharmacologically active antibody. As a mirror image of a mirror image, the
binding site of the anti-ids is predicted to be an analog of the corresponding
12q23-qter polypeptide. The anti-id can then be used to identify and isolate
peptides from banks of chemically or biologically produced peptides.
Selected peptides can subsequently be used as pharmacores.

Non-limiting examples of methods and computer tools for drug design
are described in R. Cramer et al., 1974, J. Med. Chem. 17:533; H. Kubinyi
(ed.) 1993, 3D QSAR in Drug Design, Theory, Methods, and Applications,
ESCOM, Leiden, Holland; P. Dean (ed.) 1995, Molecular Similarity in Drug
Design, K. Kim "Comparative molecular field analysis (CoMFA)" p. 291-324,
Chapman & Hall, London, UK; Y. et al., 1993, J. Comp.-Aid. Mol. Des.
7:83-102; G. Lauri and P.A. Bartlett, 1994, J. Comp.-Aid. Mol. Des. 8:51-66;
P.J. Gane and P.M. Dean, 2000, Curr. Opin. Struct. Biol. 10(4):401-4; H.O.
Kim and M. Kahn, 2000, Comb. Chem. High Throughput Screen. 3(3):167-83;
G.K. Farber, 1999, Pharmacol. Ther. 84(3):327-32; and H. van de
Waterbeemd (ed.) 1996, Structure-Property Correlations in Drug Research,
Academic Press, San Diego, CA.

In another aspect of the present invention, cells and animals that 
carry a 12q23-qter gene or an analog thereof can be used as model 
systems to study and test for substances that have potential as therapeutic 
agents. After a test agent is administered to animals or applied to the cells, 
the phenotype of the animals/cells can be determined. 

In accordance with these methods, one may design drugs that result
in, for example, altered 12q23-qter polypeptide activity or stability. Such
drugs may act as inhibitors, agonists, or antagonists of a 12q23-qter
polypeptide. By virtue of the availability of cloned 12q23-qter gene
sequences, sufficient amounts of the 12q23-qter polypeptide may be
produced to perform such analytical studies as x-ray crystallography. In
addition, the knowledge of the 12q23-qter polypeptide sequence will guide
those employing computer-modeling techniques in place of, or in addition to,
x-ray crystallography.

Pharmaceutical compositions: The present invention contemplates
compositions comprising a 12q23-qter polynucleotide (e.g., SEQ ID NO:1 to
SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687), polypeptide (e.g.,
SEQ ID NO:93 to SEQ ID NO:155), antibody, ligand (e.g., agonist,
antagonist, or inhibitor), or fragments, variants, or analogs thereof, and a
physiologically acceptable carrier, excipient, or diluent as described in detail
herein. The present invention further contemplates pharmaceutical
compositions useful in practicing the therapeutic methods of this invention.
Preferably, a pharmaceutical composition includes, in admixture, a
pharmaceutically acceptable excipient (carrier) and one or more of a
12q23-qter polypeptide, polynucleotide, ligand, antibody, or fragment,
portion, or variant thereof, as described herein, as an active ingredient. The
preparation of pharmaceutical compositions that contain 12q23-qter
molecules as active ingredients is well understood in the art. Typically, such
compositions are prepared as injectables, either as liquid solutions or
suspensions; however, solid forms suitable for solution in, or suspension in,
liquid prior to injection can also be prepared. The preparation can also be
emulsified. The active therapeutic ingredient is often mixed with excipients
that are pharmaceutically acceptable and compatible with the active
ingredient. Suitable excipients are, for example, water, saline, dextrose,
glycerol, ethanol, or the like and combinations thereof. In addition, if
desired, the composition can contain minor amounts of auxiliary substances
such as wetting or emulsifying agents or pH-buffering agents, which enhance
the effectiveness of the active ingredient.
A 12q23-qter polypeptide, polynucleotide, ligand, antibody, or
fragment, portion, or variant thereof can be formulated into the
pharmaceutical composition as neutralized physiologically acceptable salt
forms. Suitable salts include the acid addition salts (i.e., formed with the
free amino groups of the polypeptide or antibody molecule) that are
formed with inorganic acids such as, for example, hydrochloric or
phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic,
and the like. Salts formed from the free carboxyl groups can also be derived
from inorganic bases such as, for example, sodium, potassium, ammonium,
calcium, or ferric hydroxides, and such organic bases as isopropylamine,
trimethylamine, 2-ethylamino ethanol, histidine, procaine, and the like.

The pharmaceutical compositions can be administered systemically
by oral or parenteral routes. Non-limiting parenteral routes of administration
include subcutaneous, intramuscular, intraperitoneal, intravenous, 
transdermal, inhalation, intranasal, intra-arterial, intrathecal, enteral, 
sublingual, or rectal. Intravenous administration, for example, can be 
performed by injection of a unit dose. The term "unit dose" when used in
reference to a pharmaceutical composition of the present invention refers to 
physically discrete units suitable as unitary dosage for humans, each unit 
containing a predetermined quantity of active material calculated to produce 
the desired therapeutic effect in association with the required diluent, i.e.,
carrier, or vehicle.

In one particular embodiment of the present invention, the disclosed 
pharmaceutical compositions are administered via mucoactive aerosol 
therapy (see, e.g., M. Fuloria and B.K. Rubin, 2000, Respir. Care 45:868-873;
I. Gonda, 2000, J. Pharm. Sci. 89:940-945; R. Dhand, 2000, Curr.
Opin. Pulm. Med. 6(1):59-70; B.K. Rubin, 2000, Respir. Care 45(6):684-94;
S. Suarez and A.J. Hickey, 2000, Respir. Care 45(6):652-66).

Pharmaceutical compositions are administered in a manner 
compatible with the dosage formulation, and in a therapeutically effective 
amount. The quantity to be administered depends on the subject to be 
treated, capacity of the subject's immune system to utilize the active
ingredient, and degree of modulation of 12q23-qter gene activity desired. 
Precise amounts of active ingredient required to be administered depend on 
the judgment of the practitioner and are specific for each individual. 
However, suitable dosages may range from about 0.1 to 20, preferably
about 0.5 to about 10, and more preferably one to several, milligrams of
active ingredient per kilogram body weight of individual per day and depend
on the route of administration. Suitable regimens for initial administration and
booster shots are also variable, but are typified by an initial administration
followed by repeated doses at one or more hour intervals by a subsequent
injection or other administration. Alternatively, continuous intravenous
infusions sufficient to maintain concentrations of 10 nM to 10 µM in the
blood are contemplated. An exemplary pharmaceutical formulation
comprises: 12q23-qter antagonist or inhibitor (5.0 mg/ml); sodium bisulfite
USP (3.2 mg/ml); disodium edetate USP (0.1 mg/ml); and water for injection
q.s.a.d. (1.0 ml). As used herein, "pg" means picogram, "ng" means
nanogram, "µg" means microgram, "mg" means milligram, "µl" means
microliter, "ml" means milliliter, and "l" means liter.
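
As a rough illustration of the arithmetic implied by the dosage ranges above, the following Python sketch converts the per-kilogram ranges into a daily amount for a given body weight and, assuming the 5.0 mg/ml concentration of the exemplary formulation, into a volume of that formulation. The function names and the 70 kg example weight are hypothetical and are not part of the disclosure; actual dosing remains subject to the judgment of the practitioner.

```python
from dataclasses import dataclass

# Ranges quoted in the text, in mg of active ingredient per kg body weight per day.
BROAD_RANGE_MG_PER_KG = (0.1, 20.0)
PREFERRED_RANGE_MG_PER_KG = (0.5, 10.0)


@dataclass(frozen=True)
class DailyDoseRange:
    low_mg: float
    high_mg: float


def daily_dose_range(body_weight_kg, range_mg_per_kg=BROAD_RANGE_MG_PER_KG):
    """Scale a mg/kg/day range to a total mg/day range for one individual."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return DailyDoseRange(low * body_weight_kg, high * body_weight_kg)


def formulation_volume_ml(dose_mg, concentration_mg_per_ml=5.0):
    """Volume of formulation holding dose_mg of active ingredient.

    The default of 5.0 mg/ml is the concentration given for the exemplary
    formulation; any other value is an assumption of the caller.
    """
    if concentration_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return dose_mg / concentration_mg_per_ml


if __name__ == "__main__":
    # Hypothetical 70 kg individual, preferred range.
    preferred = daily_dose_range(70.0, PREFERRED_RANGE_MG_PER_KG)
    print(preferred.low_mg, preferred.high_mg)
    print(formulation_volume_ml(preferred.low_mg))
```

The sketch only restates the quoted ranges; it does not account for route of administration, which the text notes also affects the dose.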

For further guidance in preparing pharmaceutical formulations, see,
e.g., Gilman et al. (eds), 1990, Goodman and Gilman's: The
Pharmacological Basis of Therapeutics, 8th ed., Pergamon Press; and
Remington's Pharmaceutical Sciences, 17th ed., 1990, Mack Publishing
Co., Easton, PA; Avis et al. (eds), 1993, Pharmaceutical Dosage Forms:
Parenteral Medications, Dekker, New York; Lieberman et al. (eds), 1990,
Pharmaceutical Dosage Forms: Disperse Systems, Dekker, New York.

In yet another aspect of this invention, antibodies that specifically 
react with a 12q23-qter polypeptide or peptides derived therefrom can be
used as therapeutics. In particular, such antibodies can be used to block 
the activity of a 12q23-qter polypeptide. Antibodies or fragments thereof can 
be formulated as pharmaceutical compositions and administered to a 
subject. It is noted that antibody-based therapeutics produced from
non-human sources can cause an undesired immune response in human
subjects. To minimize this problem, chimeric antibody derivatives can be
produced. Chimeric antibodies combine a non-human animal variable
region with a human constant region. Chimeric antibodies can be
constructed according to methods known in the art (see Morrison et al.,
1985, Proc. Natl. Acad. Sci. USA 81:6851; Takeda et al., 1985, Nature
314:452; U.S. Patent No. 4,816,567 of Cabilly et al.; U.S. Patent No.
4,816,397 of Boss et al.; European Patent Publication EP 171496; EP
0173494; United Kingdom Patent GB 2177096B).

In addition, antibodies can be further "humanized" by any of the
techniques known in the art (e.g., Teng et al., 1983, Proc. Natl. Acad. Sci.
USA 80:7308-7312; Kozbor et al., 1983, Immunology Today 4:7279; Olsson
et al., 1982, Meth. Enzymol. 92:3-16; International Patent Application
WO92/06193; EP 0239400). Humanized antibodies can also be obtained
from commercial sources (e.g., Scotgen Limited, Middlesex, England).
Immunotherapy with a humanized antibody may result in increased
long-term effectiveness for the treatment of chronic disease situations or
situations requiring repeated antibody treatments. 

Pharmacogenetics: The 12q23-qter polynucleotides (e.g., SEQ ID
NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687) and
polypeptides (e.g., SEQ ID NO:93 to SEQ ID NO:155) of the invention are
also useful in pharmacogenetic analysis (i.e., the study of the relationship
between an individual's genotype and that individual's response to a
therapeutic composition or drug). See, e.g., M. Eichelbaum, 1996, Clin.
Exp. Pharmacol. Physiol. 23(10-11):983-985, and M.W. Linder, 1997, Clin.
Chem. 43(2):254-266. The genotype of the individual can determine the
way a therapeutic acts on the body or the way the body metabolizes the
therapeutic. Further, the activity of drug metabolizing enzymes affects both 
the intensity and duration of therapeutic activity. Differences in the activity 
or metabolism of therapeutics can lead to severe toxicity or therapeutic 
failure. Accordingly, a physician or clinician may consider applying 
knowledge obtained in relevant pharmacogenetic studies in determining
whether to administer a 12q23-qter polypeptide, polynucleotide, analog, 
antagonist, inhibitor, or modulator, as well as tailoring the dosage and/or 
therapeutic or prophylactic treatment regimen. 

In general, two types of pharmacogenetic conditions can be 
differentiated. Genetic conditions can be due to a single factor that alters
the way the drug acts on the body (altered drug action), or a factor that alters
the way the body metabolizes the drug (altered drug metabolism). These
conditions can occur either as rare genetic defects or as naturally-occurring 
polymorphisms. For example, glucose-6-phosphate dehydrogenase 
deficiency (G6PD) is a common inherited enzymopathy which results in 
haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides,
analgesics, nitrofurans) and consumption of fava beans. 

The discovery of genetic polymorphisms of drug metabolizing 
enzymes (e.g., N-acetyltransferase 2 (NAT 2) and cytochrome P450
enzymes CYP2D6 and CYP2C19) has provided an explanation as to why
some patients do not obtain the expected drug effects or show exaggerated
drug response and serious toxicity after taking the standard and safe dose 
of a drug. These polymorphisms are expressed in two phenotypes in the 
population, the extensive metabolizer (EM) and poor metabolizer (PM). The 
prevalence of PM is different among different populations. The gene coding 
for CYP2D6 is highly polymorphic and several mutations have been
identified in PM, which all lead to the absence of functional CYP2D6. Poor
metabolizers quite frequently experience exaggerated drug response and
side effects when they receive standard doses. If a metabolite is the active
therapeutic moiety, PM show no therapeutic response. This has been
demonstrated for the analgesic effect of codeine mediated by its
CYP2D6-formed metabolite morphine. At the other extreme, ultra-rapid metabolizers
fail to respond to standard doses. Recent studies have determined that 
ultra-rapid metabolism is attributable to CYP2D6 gene amplification. 

By analogy, genetic polymorphism or mutation may lead to allelic 
variants of 12q23-qter genes in the population which have different levels of
activity. The 12q23-qter polypeptides or polynucleotides thereby allow a
clinician to ascertain a genetic predisposition that can affect treatment
modality. In addition, genetic mutation or variants at other genes may
potentiate or diminish the activity of 12q23-qter-targeted drugs. Thus, in a
12q23-qter gene-based treatment, a polymorphism or mutation may give
rise to individuals that are more or less responsive to treatment. 



Accordingly, dosage would necessarily be modified to maximize the 
therapeutic effect within a given population containing the polymorphism. 
As an alternative to genotyping, specific polymorphic polypeptides or 
polynucleotides can be identified. 
To identify genes that modify 12q23-qter-targeted drug response,
several pharmacogenetic methods can be used. One pharmacogenomics 
approach, "genome-wide association", relies primarily on a high-resolution 
map of the human genome. This high-resolution map shows previously 
identified gene-related markers (e.g., a "bi-allelic" gene marker map which 
consists of 60,000-100,000 polymorphic or variable sites on the human
genome, each of which has two variants). A high-resolution genetic map 
can then be compared to a map of the genome of each of a statistically 
significant number of patients taking part in a Phase II/III drug trial to identify
markers associated with a particular observed drug response or side effect.
Alternatively, a high-resolution map can be generated from a combination of
some ten million known single nucleotide polymorphisms (SNPs) in the 
human genome. Given a genetic map based on the occurrence of such 
SNPs, individuals can be grouped into genetic categories depending on a 
particular pattern of SNPs in their individual genome. In this way, treatment 
regimens can be tailored to groups of genetically similar individuals, taking
into account traits that may be common among such genetically similar
individuals (see, e.g., D.R. Pfost et al., 2000, Trends Biotechnol. 18(8):334-8).
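
The grouping of individuals by SNP pattern described above can be sketched as follows. This is a minimal Python illustration only; the marker identifiers, genotype calls, and subject labels are hypothetical and do not correspond to any sequence in the Sequence Listing, and a practical analysis would likely use a far larger marker panel and a similarity measure rather than exact pattern matching.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, marker_ids):
    """Group individuals whose genotype calls match at every listed marker.

    genotypes: dict mapping an individual's label to a dict of
               marker id -> genotype call (e.g. "AA", "AG", "GG").
    marker_ids: ordered list of marker ids that define the pattern.
    Returns a dict mapping each observed pattern (a tuple of calls)
    to the list of individuals carrying it.
    """
    groups = defaultdict(list)
    for individual, calls in genotypes.items():
        pattern = tuple(calls[marker] for marker in marker_ids)
        groups[pattern].append(individual)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical bi-allelic markers and subjects, for illustration only.
    markers = ["snp_1", "snp_2"]
    cohort = {
        "subject_a": {"snp_1": "AA", "snp_2": "CT"},
        "subject_b": {"snp_1": "AA", "snp_2": "CT"},
        "subject_c": {"snp_1": "AG", "snp_2": "CC"},
    }
    for pattern, members in group_by_snp_pattern(cohort, markers).items():
        print(pattern, members)
```

Each resulting group could then be examined for a shared drug response, which is the step at which treatment regimens might be tailored.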

As another example, the "candidate gene approach" can be used.
According to this method, if a gene that encodes a drug target is known, all
common variants of that gene can be fairly easily identified in the population 
and it can be determined if having one version of the gene versus another is 
associated with a particular drug response. 
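
One way to check whether one version of a gene is associated with a drug response is a Pearson chi-square test on a 2x2 table of variant carriers versus responders. The Python sketch below is an assumption about how such a comparison might be made, not a method taken from this disclosure; the counts are invented, and the 3.841 threshold is the standard critical value for one degree of freedom at the 0.05 level.

```python
# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_DF1_ALPHA_005 = 3.841


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

                     responder   non-responder
        variant 1        a             b
        variant 2        c             d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one count")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def is_associated(a, b, c, d, critical=CHI2_CRITICAL_DF1_ALPHA_005):
    """True when the statistic exceeds the chosen critical value."""
    return chi_square_2x2(a, b, c, d) > critical


if __name__ == "__main__":
    # Hypothetical counts: 30 of 40 carriers of variant 1 respond,
    # versus 10 of 40 carriers of variant 2.
    print(chi_square_2x2(30, 10, 10, 30))
    print(is_associated(30, 10, 10, 30))
```

With small counts an exact test would usually be preferred; the chi-square form is used here only because it needs no external library.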

As yet another example, a "gene expression profiling approach" can
be used. This method involves testing the gene expression of an animal
treated with a drug (e.g., a 12q23-qter polypeptide, polynucleotide, analog,
or modulator) to determine whether gene pathways related to toxicity have
been turned on.

Information obtained from one of the approaches described herein
can be used to establish a pharmacogenetic profile, which can be used to
determine appropriate dosage and treatment regimens for prophylactic or
therapeutic treatment of an individual. A pharmacogenetic profile, when
applied to dosing or drug selection, can be used to avoid adverse reactions
or therapeutic failure and thus enhance therapeutic or prophylactic efficiency
when treating a subject with a 12q23-qter polypeptide, polynucleotide,
analog, antagonist, inhibitor, or modulator.

The 12q23-qter polypeptides or polynucleotides of the invention are 
also useful for monitoring therapeutic effects during clinical trials and other 
treatment. Thus, the therapeutic effectiveness of an agent that is designed 
to increase or decrease gene expression, polypeptide levels, or activity can 
be monitored over the course of treatment using the 12q23-qter
compositions or modulators. For example, monitoring can be performed by:
1) obtaining a pre-administration sample from a subject prior to
administration of the agent; 2) detecting the level of expression or activity of
the protein in the pre-administration sample; 3) obtaining one or more
post-administration samples from the subject; 4) detecting the level of expression
or activity of the polypeptide in the post-administration samples; 5)
comparing the level of expression or activity of the polypeptide in the
pre-administration sample with the polypeptide in the post-administration sample
or samples; and 6) increasing or decreasing the administration of the agent
to the subject accordingly.
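
Steps 5) and 6) of the monitoring procedure above amount to a comparison followed by a decision. The Python sketch below shows one possible form of that decision; the fold-change window of 2.0 to 4.0, the function name, and the returned strings are hypothetical choices made for illustration and are not values stated in this disclosure.

```python
from statistics import mean


def recommend_adjustment(pre_level, post_levels, goal, target_fold=(2.0, 4.0)):
    """Compare pre- and post-administration levels and suggest an adjustment.

    pre_level:   expression or activity level in the pre-administration sample.
    post_levels: levels measured in one or more post-administration samples.
    goal:        "increase" or "decrease", i.e. what the agent is designed to do.
    target_fold: (low, high) window of acceptable fold change in the direction
                 of the goal; an assumed, illustrative window.
    """
    if goal not in ("increase", "decrease"):
        raise ValueError("goal must be 'increase' or 'decrease'")
    if pre_level <= 0 or not post_levels or any(x <= 0 for x in post_levels):
        raise ValueError("levels must be positive and post_levels non-empty")

    post = mean(post_levels)
    # Fold change expressed so that a larger value always means a stronger effect.
    effect = post / pre_level if goal == "increase" else pre_level / post

    low, high = target_fold
    if effect < low:
        return "increase administration"
    if effect > high:
        return "decrease administration"
    return "maintain administration"


if __name__ == "__main__":
    # Agent designed to decrease expression; levels are in arbitrary units.
    print(recommend_adjustment(100.0, [80.0, 70.0], "decrease"))
```

Averaging the post-administration samples is a simplification; a trend over time could be used instead where several samples are available.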

Gene therapy: The 12q23-qter polynucleotides (e.g., SEQ ID NO:1
to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687) and
polypeptides (e.g., SEQ ID NO:93 to SEQ ID NO:155) of the invention also
find use as gene therapy reagents. In recent years, significant technological
advances have been made in the area of gene therapy for [...]
acquired diseases (Kay et al., 1997, Proc. Natl. Acad. Sci. USA 94:12744-
12746). Gene therapy can be defined as the transfer of DNA for therapeutic
purposes. Improvement in gene transfer methods has allowed for 
development of gene therapy protocols for the treatment of diverse types of 
diseases. Gene therapy has also taken advantage of recent advances in
the identification of new therapeutic genes, improvement in both viral and
non-viral gene delivery systems, better understanding of gene regulation,
and improvement in cell isolation and transplantation. Gene therapy would
be carried out according to generally accepted methods as described by, for
example, Friedman, 1991, Therapy for Genetic Diseases, Friedman, Ed.,
Oxford University Press, pages 105-121.

Vectors for introduction of genes both for recombination and for 
extrachromosomal maintenance are known in the art, and any suitable 
vector may be used. Methods for introducing DNA into cells such as 
electroporation, calcium phosphate co-precipitation, and viral transduction 
are known in the art, and the choice of method is within the competence of
one skilled in the art (Robbins (ed), 1997, Gene Therapy Protocols, Humana
Press, NJ). Cells transformed with a 12q23-qter gene can be used as
model systems to study chromosome 12 disorders and to identify drug 
treatments for the treatment of such disorders. 
Gene transfer systems known in the art may be useful in the practice
of the gene therapy methods of the present invention. These include viral 
and non-viral transfer methods. A number of viruses have been used as 
gene transfer vectors, including polyoma, i.e., SV40 (Madzak et al., 1992, J.
Gen. Virol., 73:1533-1536), adenovirus (Berkner, 1992, Curr. Top. Microbiol.
Immunol., 158:39-6; Berkner et al., 1988, BioTechniques, 6:616-629;
Gorziglia et al., 1992, J. Virol., 66:4407-4412; Quantin et al., 1992, Proc.
Natl. Acad. Sci. USA, 89:2581-2584; Rosenfeld et al., 1992, Cell, 68:143-155;
Wilkinson et al., 1992, Nucl. Acids Res., 20:2233-2239;
Stratford-Perricaudet et al., 1990, Hum. Gene Ther., 1:241-256), vaccinia virus
(Mackett et al., 1992, Biotechnology, 24:495-499), adeno-associated virus
(Muzyczka, 1992, Curr. Top. Microbiol. Immunol., 158:91-123; Ohi et al.,
1990, Gene, 89:279-282), herpes viruses including HSV and EBV
(Margolskee, 1992, Curr. Top. Microbiol. Immunol., 158:67-90; Johnson et
al., 1992, J. Virol., 66:2952-2965; Fink et al., 1992, Hum. Gene Ther., 3:11-19;
Breakfield et al., 1987, Mol. Neurobiol., 1:337-371; Fresse et al., 1990,
Biochem. Pharmacol., 40:2189-2199), and retroviruses of avian
(Brandyopadhyay et al., 1984, Mol. Cell Biol., 4:749-754; Petropouplos et
al., 1992, J. Virol., 66:3391-3397), murine (Miller, 1992, Curr. Top. Microbiol.
Immunol., 158:1-24; Miller et al., 1985, Mol. Cell Biol., 5:431-437; Sorge et
al., 1984, Mol. Cell Biol., 4:1730-1737; Mann et al., 1985, J. Virol., 54:401-407),
and human origin (Page et al., 1990, J. Virol., 64:5370-5276;
Buchschalcher et al., 1992, J. Virol., 66:2731-2739). Most human gene
therapy protocols have been based on disabled murine retroviruses. 

Non-viral gene transfer methods known in the art include chemical
techniques such as calcium phosphate coprecipitation (Graham et al., 1973,
Virology, 52:456-467; Pellicer et al., 1980, Science, ...),
mechanical techniques, for example microinjection (Anderson et al., 1980,
Proc. Natl. Acad. Sci. USA, 77:5399-5403; Gordon et al., 1980, Proc. Natl.
Acad. Sci. USA, 77:7380-7384; Brinster et al., 1981, Cell, 27:223-231;
Constant et al., 1981, Nature, 294:92-94), membrane fusion-mediated
transfer via liposomes (Felgner et al., 1987, Proc. Natl. Acad. Sci. USA,
84:7413-7417; Wang et al., 1989, Biochemistry, 28:9508-9514; Kaneda et
al., 1989, J. Biol. Chem., 264:12126-12129; Stewart et al., 1992, Hum. Gene
Ther., 3:267-275; Nabel et al., 1990, Science, 249:1285-1288; Lim et al.,
1992, Circulation, 83:2007-2011), and direct DNA uptake and
receptor-mediated DNA transfer (Wolff et al., 1990, Science, 247:1465-1468; Wu et
al., 1991, BioTechniques, 11:474-485; Zenke et al., 1990, Proc. Natl. Acad.
Sci. USA, 87:3655-3659; Wu et al., 1989, J. Biol. Chem., ...;
Wolff et al., 1991, BioTechniques, 11:474-485; Wagner et al., 1991, Proc.
Natl. Acad. Sci. USA, 88:4255-4259; Cotten et al., 1990, Proc. Natl. Acad.
Sci. USA, 87:4033-4037; Curiel et al., 1991, Proc. Natl. Acad. Sci. USA,
88:8850-8854; Curiel et al., 1991, Hum. Gene Ther., 3:147-154).

In one approach, plasmid DNA is complexed with a polylysine-conjugated
antibody specific to the adenovirus hexon protein, and the resulting complex 
is bound to an adenovirus vector. The trimolecular complex is then used to 
infect cells. The adenovirus vector permits efficient binding, internalization, 
and degradation of the endosome before the coupled DNA is damaged. In 
another approach, liposome/DNA is used to mediate direct in vivo gene 
transfer. While in standard liposome preparations the gene transfer process 
is non-specific, localized in vivo uptake and expression have been reported 
in tumor deposits, for example, following direct in situ administration (Nabel, 
1992, Hum. Gene Ther., 3:399-410). 

Suitable gene transfer vectors possess a promoter sequence, 
preferably a promoter that is cell-specific and placed upstream of the 
sequence to be expressed. The vectors may also contain, optionally, one or 
more expressible marker genes for expression as an indication of successful 
transfection and expression of the nucleic acid sequences contained in the 
vector. In addition, vectors can be optimized to minimize undesired 
immunogenicity and maximize long-term expression of the desired gene 
product(s) (see Nabel, 1999, Proc. Natl. Acad. Sci. USA 96:324-326).
Moreover, vectors can be chosen based on the cell type that is targeted for
treatment. Notably, gene transfer therapies have been initiated for the
treatment of various pulmonary diseases (see, e.g., M.J. Welsh, 1999, J.
Clin. Invest. 104(9):1165-6; D.L. Ennist, 1999, Trends Pharmacol. Sci.
20:260-266; S.M. Albelda et al., 2000, Ann. Intern. Med. 132:649-660; E.
Alton and C. Kitson, 2000, Expert Opin. Investig. Drugs 9(7):1523-35).

Illustrative examples of vehicles or vector constructs for transfection 
or infection of the host cells include replication-defective viral vectors, DNA 
virus or RNA virus (retrovirus) vectors, such as adenovirus, herpes simplex 
virus and adeno-associated viral vectors. Adeno-associated virus vectors 
are single stranded and allow the efficient delivery of multiple copies of 
nucleic acid to the cell's nucleus. Preferred are adenovirus vectors. The 
vectors will normally be substantially free of any prokaryotic DNA and may
comprise a number of different functional nucleic acid sequences. An
example of such functional sequences may be a DNA region comprising
transcriptional and translational initiation and termination regulatory
sequences, including promoters (e.g., strong promoters, inducible
promoters, and the like) and enhancers which are active in the host cells.
Also included as part of the functional sequences is an open reading frame 
(polynucleotide sequence) encoding a protein of interest. Flanking 
sequences may also be included for site-directed integration. In some 
situations, the 5'-flanking sequence will allow homologous recombination, 
thus changing the nature of the transcriptional initiation region, so as to
provide for inducible or non-inducible transcription to increase or decrease 
the level of transcription, as an example. 

In general, the encoded and expressed 12q23-qter polypeptide may 
be intracellular, i.e., retained in the cytoplasm, nucleus, or in an organelle, or 
may be secreted by the cell. For secretion, the natural signal sequence
present in a 12q23-qter polypeptide may be retained. When the polypeptide 
or peptide is a fragment of a 12q23-qter protein, a signal sequence may be 
provided so that, upon secretion and processing at the processing site, the 
desired protein will have the natural sequence. Specific examples of coding 
sequences of interest for use in accordance with the present invention
include the 12q23-qter polypeptide-coding sequences disclosed herein. 

As previously mentioned, a marker may be present for selection of 
cells containing the vector construct. The marker may be an inducible or 
non-inducible gene and will generally allow for positive selection under 
induction, or without induction, respectively. Examples of marker genes
include neomycin, dihydrofolate reductase, glutamine synthetase, and the 
like. The vector employed will generally also include an origin of replication 
and other genes that are necessary for replication in the host cells, as 
routinely employed by those having skill in the art. As an example, the 
replication system comprising the origin of replication and any proteins 
associated with replication encoded by a particular virus may be included as
part of the construct. The replication system must be selected so that the
genes encoding products necessary for replication do not ultimately
transform the cells. Such replication systems are represented by
replication-defective adenovirus (see G. Acsadi et al., 1994, Hum. Mol.
Genet. 3:579-584) and by Epstein-Barr virus. Examples of replication
defective vectors, particularly retroviral vectors that are replication
defective, are BAG (see Price et al., 1987, Proc. Natl. Acad. Sci. USA,
84:156; Sanes et al., 1986, EMBO J., 5:3133). It will be understood that the
final gene construct may contain one or more genes of interest, for example,
a gene encoding a bioactive metabolic molecule. In addition, cDNA,
synthetically produced DNA or chromosomal DNA may be employed
utilizing methods and protocols known and practiced by those having skill in
the art.

According to one approach for gene therapy, a vector encoding a
12q23-qter polypeptide is directly injected into the recipient cells (in vivo
gene therapy). Alternatively, cells from the intended recipients are
explanted, genetically modified to encode a 12q23-qter polypeptide, and
reimplanted into the donor (ex vivo gene therapy). An ex vivo approach
provides the advantage of efficient viral gene transfer, which is superior to in
vivo gene transfer approaches. In accordance with ex vivo gene therapy,
the host cells are first transfected with engineered vectors containing at least
one gene encoding a 12q23-qter polypeptide, suspended in a
physiologically acceptable carrier or excipient such as saline or phosphate
buffered saline, and the like, and then administered to the host. The desired
gene product is expressed by the injected cells, which thus introduce the
gene product into the host. The introduced gene products can thereby be
utilized to treat or ameliorate a disorder (e.g., asthma, obesity, or
inflammatory bowel disease) that is related to altered levels of the
12q23-qter polypeptide.

ANIMAL MODELS

In accordance with the present invention, 12q23-qter polynucleotides
(e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID
NO:4687) can be used to generate genetically altered non-human animals
or human cell lines. Any non-human animal can be used; however, typical
animals are rodents, such as mice, rats, or guinea pigs. Genetically
engineered animals or cell lines can carry a gene that has been altered to
contain deletions, substitutions, insertions, or modifications of the
polynucleotide sequence (e.g., exon sequence). Such alterations may
render the gene nonfunctional (i.e., a null mutation), producing a "knockout"
animal or cell line. In addition, genetically engineered animals can carry one
or more exogenous or non-naturally occurring genes, i.e., "transgenes", that
are derived from different organisms (e.g., humans), or produced by
synthetic or recombinant methods. Genetically altered animals or cell lines
can be used to study 12q23-qter gene function, regulation, and treatments
for 12q23-qter-related diseases. In particular, knockout animals and cell
lines can be used to establish animal models and in vitro models for
12q23-qter-related illnesses, respectively. In addition, transgenic animals
expressing human 12q23-qter can be used in drug discovery efforts.

A "transgenic animal" is any animal containing one or more cells
bearing genetic information altered or received, directly or indirectly, by
deliberate genetic manipulation at a subcellular level, such as by targeted
recombination or microinjection or infection with recombinant virus. The
term "transgenic animal" is not intended to encompass classical
cross-breeding or in vitro fertilization, but rather is meant to encompass animals in
which one or more cells are altered by, or receive, a recombinant DNA
molecule. This recombinant DNA molecule may be specifically targeted to a
defined genetic locus, may be randomly integrated within a chromosome, or
it may be extrachromosomally replicating DNA.

Transgenic animals can be selected after treatment of germline cells
or zygotes. For example, expression of an exogenous 12q23-qter gene or a
variant can be achieved by operably linking the gene to a promoter and
optionally an enhancer, and then microinjecting the construct into a zygote
(see, e.g., Hogan et al., Manipulating the Mouse Embryo, A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY). Such
treatments include insertion of the exogenous gene and disrupted
homologous genes. Alternatively, the gene(s) of the animals may be
disrupted by insertion or deletion mutation or other genetic alterations using
conventional techniques (see, e.g., Capecchi, 1989, Science, 244:1288;
Valancius et al., 1991, Mol. Cell Biol., 11:1402; Hasty et al., 1991, Nature,
350:243; Shinkai et al., 1992, Cell, 68:855; Mombaerts et al., 1992, Cell,
68:869; Philpott et al., 1992, Science, 256:1448; Snouwaert et al., 1992,
Science, 257:1083; Donehower et al., 1992, Nature, 356:215).

In one aspect of the invention, 12q23-qter gene knockout mice can
be produced in accordance with well-known methods (see, e.g., M.R.
Capecchi, 1989, Science, 244:1288-1292; P. Li et al., 1995, Cell 80:401-411;
L.A. Galli-Taliadoros et al., 1995, J. Immunol. Methods 181(1):1-15;
C.H. Westphal et al., 1997, Curr. Biol. 7(7):530-3; S.S. Cheah et al., 2000,
Methods Mol. Biol. 136:455-63). The disclosed murine 12q23-qter genomic
clone can be used to prepare a 12q23-qter targeting construct that can
disrupt 12q23-qter in the mouse by homologous recombination at the
12q23-qter chromosomal locus. The targeting construct can comprise a
disrupted or deleted 12q23-qter gene sequence that inserts in place of the
functioning portion of the native mouse gene. For example, the construct
can contain an insertion in the 12q23-qter protein-coding region.

Preferably, the targeting construct contains markers for both positive
and negative selection. The positive selection marker allows the selective
elimination of cells that lack the marker, while the negative selection marker
allows the elimination of cells that carry the marker. In particular, the
positive selectable marker can be an antibiotic resistance gene such as the
neomycin resistance gene, which can be placed within the coding sequence
of a 12q23-qter gene to render it non-functional while at the same time
rendering the construct selectable. The herpes simplex virus thymidine
kinase (HSV tk) gene is an example of a negative selectable marker that
can be used as a second marker to eliminate cells that carry it. Cells with
the HSV tk gene are selectively killed in the presence of gancyclovir. As
an example, a positive selection marker can be positioned on a targeting
construct within the region of the construct that integrates at the locus of the
12q23-qter gene. The negative selection marker can be positioned on the
targeting construct outside the region that integrates at the locus of the
12q23-qter gene. Thus, if the entire construct is present in the cell, both
positive and negative selection markers will be present. If the construct has
integrated into the genome, the positive selection marker will be present, but
the negative selection marker will be lost.
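
The logic of the positive and negative selection described above can be restated as a small truth table. The Python sketch below is illustrative only: the class, field, and label names are hypothetical, and it assumes that cells are exposed both to the antibiotic matching the positive marker and to gancyclovir.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    label: str
    has_positive_marker: bool  # e.g. the neomycin resistance gene
    has_negative_marker: bool  # e.g. the HSV tk gene


def survives_selection(cell):
    """Return True if the cell survives both selection steps.

    Positive selection eliminates cells that lack the positive marker.
    Negative selection eliminates cells that carry the negative marker.
    """
    survives_positive = cell.has_positive_marker
    survives_negative = not cell.has_negative_marker
    return survives_positive and survives_negative


# The three outcomes discussed in the text.
CELLS = [
    Cell("no construct present", False, False),
    Cell("entire construct present", True, True),
    Cell("construct integrated at the target locus", True, False),
]

SURVIVORS = [cell.label for cell in CELLS if survives_selection(cell)]

if __name__ == "__main__":
    print(SURVIVORS)
```

Only the cells in which the negative marker has been lost while the positive marker is retained pass both steps, which is the enrichment the two-marker design is intended to provide.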

The targeting construct can be employed, for example, in embryonal
stem (ES) cells. ES cells may be obtained from pre-implantation embryos
(M.J. Evans et al., 1981, Nature 292:154-156; [...]
et al., 1984, Nature 309:255-258; Gossler et al., 1986, Proc. Natl. Acad. Sci.
USA 83:9065-9069; Robertson et al., 1986, Nature 322:445-448; [...]
(Thomas et al., 1987, Cell 51:[...];
[...], 1989, Cell 56:145-147; Capecchi, 1989, Trends in Genet. [...]
cells for homologous insertion, followed by screening individual clones (Kim
et al., 1988, Nucleic Acids Res. 16:8887-8903; Kim et al., 1991, Gene
103:227-233). Another method employs a marker gene that is constructed so
that it will only be active if homologous insertion occurs, allowing these
recombinants to be selected directly (Sedivy et al., 1989, Proc. Natl. Acad.
Sci. USA 86:227-231). For example, the positive-negative selection (PNS)
method can be used as described above (see, e.g., Mansour et al., 1988,
Nature 336:348-352; Capecchi, 1989, Science 244:1288-1292; Capecchi,
1989, Trends in Genet. 5:70-76). In particular, the PNS method is useful for
targeting genes that are expressed at low levels.

The absence of functional 12q23-qter gene in the knockout mice can
be confirmed, for example, by RNA analysis, protein expression analysis,
and functional studies. For RNA analysis, RNA samples are prepared from
different organs of the knockout mice and the 12q23-qter transcript is
detected in Northern blots using oligonucleotide probes specific for the
transcript. For protein expression detection, antibodies that are specific for
the 12q23-qter polypeptide are used, for example, in flow cytometric
analysis, immunohistochemical staining, and activity assays. Alternatively,
functional assays are performed using preparations of different cell types.


Several approaches can be used to produce transgenic mice. In one
approach, a targeting vector is integrated into ES cells by homologous
recombination, an intrachromosomal recombination event is used to
eliminate the selectable markers, and only the transgene is left behind
(Joyner et al., 1989, Nature 338:153-6; Hasty et al., 1991, Nature
350(6315):243-6; Valancius et al., 1991, Mol. Cell. Biol. 11(3):1402-8;
S. Fiering et al., 1993, Proc. Natl. Acad. Sci. USA 90(18):8469-73). In an
alternative approach, two or more strains are created; one strain contains
the gene knocked-out by homologous recombination, while one or more
strains contain transgenes. The knockout strain is crossed with the
transgenic strain to produce new lines of animals in
which the original wild-type allele has been replaced (although not at the
same site) with a transgene. Notably, knockout and transgenic animals can
be produced by commercial facilities (e.g., The Lerner Research Institute,
Cleveland, OH; B&K Universal, Inc., Fremont, CA; DNX Transgenic
Sciences, Cranbury, NJ; Incyte Genomics, Inc., St. Louis, MO).

Transgenic animals (e.g., mice) containing a nucleic acid molecule
which encodes a human 12q23-qter polypeptide may be used as in vivo
models to study the overexpression of a 12q23-qter gene. Such animals
can also be used in drug evaluation and discovery efforts to find compounds
effective to inhibit or modulate the activity of a 12q23-qter gene, such as, for
example, compounds for treating respiratory disorders, diseases, or
conditions. One having ordinary skill in the art can use standard techniques
to produce transgenic animals which produce a human 12q23-qter
polypeptide, and use the animals in drug evaluation and discovery efforts
(see, e.g., U.S. Patent No. 4,873,191 to Wagner; U.S. Patent No. 4,736,866
to Leder).

In another embodiment of the present invention, the transgenic
animal can comprise a recombinant expression vector in which the
nucleotide sequence that encodes a human 12q23-qter polypeptide is
operably linked to a tissue specific promoter whereby the coding sequence
is only expressed in that specific tissue. For example, the tissue specific
promoter can be a mammary cell specific promoter and the recombinant
protein so expressed is recovered from the animal's milk.

In yet another embodiment of the present invention, a 12q23-qter
gene "knockout" can be produced by administering to the animal antibodies
(e.g., neutralizing antibodies) that specifically recognize an endogenous
12q23-qter polypeptide. The antibodies can act to disrupt the function of the
endogenous 12q23-qter polypeptide, and thereby produce a null phenotype.
In one specific example, an orthologous mouse 12q23-qter polypeptide or
peptide can be used to generate antibodies. These antibodies can be given
to a mouse to knockout the function of the mouse 12q23-qter ortholog.

In another embodiment of the present invention, non-mammalian
organisms may be used to study 12q23-qter genes and related
diseases. In particular, model organisms such as C. elegans, D.
melanogaster, and S. cerevisiae may be used. Orthologs of 12q23-qter
genes can be identified in these model organisms, and mutated or deleted
to produce strains deficient for 12q23-qter genes. Human 12q23-qter genes
can then be tested for the ability to "complement" the deficient strains. Such
strains can also be used for drug screening. The 12q23-qter orthologs can
be used to facilitate the understanding of the biological function of the
human 12q23-qter genes, and assist in the identification of binding factors
(e.g., agonists, antagonists, and inhibitors).

GENE IDENTIFICATION. 

To identify genes in the region on 12q23-qter, a set of bacterial
artificial chromosome (BAC) clones containing this chromosomal region was
identified. The BAC clones served as a template for genomic DNA
sequencing and as reagents for identifying coding sequences by direct
cDNA selection. Genomic sequencing and direct cDNA selection were used
to characterize DNA from 12q23-qter in accordance with the methods
described in detail herein.

When a gene has been genetically localized to a specific
chromosomal region, the genes in this region can be characterized at the
molecular level by a series of steps that include: (1) cloning the entire region
of DNA in a set of overlapping genomic clones (physical mapping); (2)
characterizing the genes encoded by these clones by a combination of
direct cDNA selection, exon trapping and DNA sequencing (gene
identification); and (3) identifying mutations in these genes by comparative
DNA sequencing of affected and unaffected members of the kindreds and/or
in unrelated affected individuals and unrelated unaffected controls (mutation
analysis).

Physical mapping is accomplished by screening libraries of human
DNA cloned in vectors that are propagated in a host such as E. coli, using
hybridization or PCR assays from unique molecular landmarks in the
chromosomal region of interest. To generate a physical map of the disorder
region, a library of human DNA cloned in BACs was screened with a set of
overgo markers that had been previously mapped to chromosome 12q23-
qter by the efforts of the Human Genome Project. Overgos are unique
molecular landmarks in the human genome that can be assayed by
hybridization. Through the combined efforts of the Human Genome Project,
the location of thousands of overgos on the twenty-two autosomes and two
sex chromosomes has been determined. For a positional cloning effort, the
physical map is tied to the genetic map because the markers used for
genetic mapping can also be used as overgos for physical mapping. By
screening a BAC library with a combination of overgos derived from genetic
markers, genes, and random DNA fragments, a physical map comprised of
overlapping clones representing all of the DNA in a chromosomal region of
interest can be assembled.

BACs are cloning vectors for large
segments of human or other DNA that are propagated in E. coli. To
construct a physical map using BACs, a library of BAC clones is screened
so that individual clones harboring the DNA sequence corresponding to a
given overgo or set of overgos are identified. Throughout most of the
human genome, the overgo markers are spaced approximately 20 to 50
kilobases apart, so that an individual BAC clone typically contains at least
two overgo markers. In addition, the BAC libraries that were screened
contain enough cloned DNA to cover the human genome twelve times over.
Accordingly, an individual overgo typically identifies more than one BAC
clone. By screening a twelve-fold coverage BAC library with a series of
overgo markers, a physical map
consisting of a series of overlapping contiguous BAC clones, i.e., BAC
"contigs," can be assembled for any region of the human genome. This map
is tied to the genetic map because many of the overgo markers used
to prepare the physical map are also genetic markers.
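
The arithmetic behind the two claims above (several overgos per BAC, several BACs per overgo) can be sketched as follows. This is a rough illustration, not part of the disclosed method: the 150 kilobase insert size is an assumed value chosen only for the example, and the Poisson model of random clone placement is a standard simplification.

```python
import math


def expected_markers_per_clone(insert_kb: float, marker_spacing_kb: float) -> float:
    """Average number of evenly spaced markers falling within one clone insert."""
    return insert_kb / marker_spacing_kb


def prob_locus_in_library(fold_coverage: float) -> float:
    """Probability that at least one clone covers a given locus.

    Assumes clones are placed at random, so the number of clones covering a
    locus is approximately Poisson with mean equal to the fold coverage.
    """
    return 1.0 - math.exp(-fold_coverage)


ASSUMED_INSERT_KB = 150.0  # hypothetical BAC insert size, not taken from the text

markers_sparse = expected_markers_per_clone(ASSUMED_INSERT_KB, 50.0)  # 50 kb spacing
markers_dense = expected_markers_per_clone(ASSUMED_INSERT_KB, 20.0)   # 20 kb spacing
p_covered = prob_locus_in_library(12.0)                               # twelve-fold library

print(markers_sparse, markers_dense, round(p_covered, 6))
```

Under these assumptions a clone carries roughly three to seven markers, and an average of twelve clones covers any given locus, which is consistent with an overgo typically identifying more than one BAC clone.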

When constructing a physical map, it often happens that there are
gaps in the overgo map of the genome that result in the inability to identify
BAC clones that are overlapping in a given location. Typically, the physical
map is first constructed from a set of overgos identified through the publicly
available literature and World Wide Web resources. The initial map consists
of several separate BAC contigs that are separated by gaps of unknown
molecular distance. To identify BAC clones that fill these gaps, it is
necessary to develop new overgo markers from the ends of the clones on
either side of the gap. This is done by sequencing the terminal 200 to 300
base pairs of the BACs flanking the gap, and developing a PCR or
hybridization based assay. If the terminal sequences are demonstrated to
be unique within the human genome, then the new overgo can be used to
screen the BAC library to identify additional BACs that contain the DNA from
the gap in the physical map. To assemble a BAC contig that covers a
region the size of the disorder region (6,000,000 or more base pairs), it is
necessary to develop new overgo markers from the ends of a number of
clones.

After building a BAC contig, this set of overlapping clones serves as a
template for identifying the genes encoded in the chromosomal region.
Gene identification can be accomplished by many methods. Three methods
are commonly used: (1) a set of BACs selected from the BAC contig to
represent the entire chromosomal region can be sequenced, and
computational methods can be used to identify all of the genes; (2) the
BACs from the BAC contig can be used as a reagent to clone cDNAs
corresponding to the genes encoded in the region by a method termed
direct cDNA selection; or (3) the BACs from the BAC contig can be used to
identify coding sequences by selecting for specific DNA sequence motifs in
a procedure called exon trapping. The present invention
includes 12q23-qter genes identified by the first two methods.

To sequence the entire BAC contig representing the disorder region,
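a set of BACs can be chosen for subcloning into plasmid vectors and
subsequent DNA sequencing of these subclones. Since the DNA cloned in
the BACs represents genomic DNA, this sequencing is referred to as
genomic sequencing to distinguish it from cDNA sequencing. To initiate the
genomic sequencing for a chromosomal region of interest, several non-
overlapping BAC clones are chosen. DNA for each BAC clone is prepared,
and the clones are sheared into random small fragments which are
subsequently cloned into standard plasmid vectors such as pUC18. The
plasmid clones are then grown to propagate the smaller fragments, and
these are the templates for sequencing. To ensure adequate coverage and
sequence quality for the BAC DNA sequence, sufficient plasmid clones are
sequenced to yield three-fold coverage of the BAC clone. For example, if
the BAC is 100 kilobases long, then phagemids are sequenced to yield 300
kilobases of sequence. Since the BAC DNA was randomly sheared prior to
cloning in the phagemid vector, the 300 kilobases of raw DNA sequence can
be assembled by computational methods into overlapping DNA sequences
termed sequence contigs. For the purposes of initial gene identification by
computational methods, three-fold coverage of each BAC is sufficient to
yield twenty to forty sequence contigs of 1000 base pairs to 20,000 base
pairs.

The sequencing strategy employed in this invention was to initially
sequence "seed" BACs from the BAC contig in the disorder region. The
sequence of the "seed" BACs was then used to identify minimally
overlapping BACs from the contig, and these were subsequently
sequenced. In this manner, the entire candidate region was sequenced,
with several small sequence gaps left in each BAC. This sequence served
as the template for computational gene identification.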
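
The three-fold coverage example given above (a 100 kilobase BAC requiring 300 kilobases of raw sequence) can be worked through as a short calculation. The sketch below is illustrative only: the 500 base pair read length is an assumed value, and the covered-fraction estimate uses the standard Poisson approximation for randomly sheared fragments rather than anything stated in this document.

```python
import math


def raw_sequence_needed_kb(bac_length_kb: float, fold_coverage: float) -> float:
    """Total raw sequence required to reach the requested fold coverage."""
    return bac_length_kb * fold_coverage


def reads_needed(bac_length_kb: float, fold_coverage: float, read_length_bp: int) -> int:
    """Number of reads of a given length needed to reach the requested coverage."""
    total_bp = bac_length_kb * 1000 * fold_coverage
    return math.ceil(total_bp / read_length_bp)


def expected_fraction_covered(fold_coverage: float) -> float:
    """Expected fraction of bases covered at least once under random shearing."""
    return 1.0 - math.exp(-fold_coverage)


ASSUMED_READ_LENGTH_BP = 500  # hypothetical read length, chosen for the example

raw_kb = raw_sequence_needed_kb(100, 3)
n_reads = reads_needed(100, 3, ASSUMED_READ_LENGTH_BP)
fraction = expected_fraction_covered(3)

print(raw_kb, n_reads, round(fraction, 3))
```

At three-fold coverage roughly 95% of the insert is expected to be sequenced at least once, which fits the description of several small sequence gaps remaining in each BAC.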

In one approach, genes can be identified by comparing the sequence of
the BAC contig to publicly available databases of cDNA and genomic
sequences, e.g., UniGene and the DNA Database of Japan (DDBJ). The BAC
DNA sequence can also be translated into protein sequence, and the protein
sequence can be used to search publicly available protein databases, e.g.,
GenPept, EMBL, Protein Information Resource (PIR), Protein Data Bank
(PDB), and SWISS-PROT. These comparisons are typically done using the
BLAST family of computer algorithms and programs (Altschul et al., 1990,
J. Mol. Biol. 215:403-410). For nucleotide queries, BLASTN, BLASTX, and
TBLASTX are used. BLASTN compares a nucleotide query sequence with a
nucleotide sequence database. BLASTX compares the six-frame
conceptual translation products of a nucleotide query sequence against a
protein sequence database. TBLASTX compares the six-frame translations
of a nucleotide query sequence against the six-frame translations of a
nucleotide sequence database. For protein queries, BLASTP and TBLASTN
are used. BLASTP compares a protein query sequence with a protein
sequence database. TBLASTN compares a protein query sequence against a
nucleotide sequence database translated in all six reading frames.
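
The six-frame translation that BLASTX, TBLASTX, and TBLASTN rely on can be illustrated with a minimal, self-contained sketch. This is not the BLAST implementation; it only shows how one nucleotide sequence yields six conceptual protein sequences (three reading frames on each strand) using the standard genetic code. The function names and the frame labels "+1" to "-3" are choices made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame 1; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq: str) -> dict:
    """Translate both strands in all three reading frames."""
    frames = {}
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

Each of the six translated sequences would then be compared against a protein database (as in BLASTX) or against translated database entries (as in TBLASTX).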

Additionally, computer algorithms such as MZEF (Zhang, 1997, Proc.
Natl. Acad. Sci. USA 94:565-568), GRAIL (Uberbacher et al., 1996, Methods
Enzymol. 266:259-281), and Genscan (Burge and Karlin, 1997, J. Mol. Biol.
268:78-94) can be used to predict the location of exons in the sequence
based on the presence of specific DNA sequence motifs that are common to
all exons, as well as the presence of codon usage typical of human protein
encoding sequences.

In addition to identification of genes by computational methods, genes can
also be identified by direct cDNA selection (Del Mastro and Lovett,
Methods in Molecular Biology, Humana Press Inc.). In direct cDNA
selection, cDNA pools from tissues of interest are prepared, and BACs from
the candidate region are used in a liquid hybridization assay to capture the
cDNAs which base pair to coding regions in the BAC. In the methods
described herein, the cDNA pools were created from several different
tissues by random priming and oligo dT priming the first strand cDNA from
polyA+ RNA, synthesizing the second-strand cDNA by standard methods,
and adding linkers to the ends of the cDNA fragments. In this approach, the
linkers are used to amplify the cDNA pools of BAC clones from the disorder
region identified by screening a BAC library. The amplified products are
then used as a template for initiating DNA synthesis to create a biotin
labeled copy of BAC DNA. Following this, the biotin labeled copy of the
BAC DNA is denatured and incubated with an excess of the PCR amplified,
linkered cDNA pools which have also been denatured. The BAC DNA and
cDNA are allowed to anneal in solution, and heteroduplexes between the
BAC and the cDNA are isolated using streptavidin coated magnetic beads.
The cDNAs that are captured by the BAC are then amplified using primers
complementary to the linker sequences, and the hybridization/selection
process is repeated for a second round. After two rounds of direct cDNA
selection, the cDNA fragments are cloned, and a library of these direct
selected fragments is created.

The cDNA clones isolated by direct selection are analyzed by two
methods. Since a pool of BACs from the disorder region is used to provide
the genomic target DNA sequence, the cDNAs must be mapped to BAC
genomic clones to verify their chromosomal location. This is accomplished
by arraying the cDNAs in microtiter dishes, and replicating their DNA in high-
density grids. Individual genomic clones known to map to the region are
then hybridized to the grid to identify direct selected cDNAs mapping to that
region. cDNA clones that are confirmed to correspond to individual BACs
are sequenced. To determine whether the cDNA clones isolated by direct
selection share sequence identity or similarity to previously identified genes,
the DNA and protein coding sequences are compared to publicly available
databases using the BLAST family of programs.

The combination of genomic DNA sequence and cDNA sequence
provided by BAC sequencing and by direct cDNA selection yields an initial
list of putative genes in the region. The genes in the region were all
candidates for the asthma locus. To further characterize each gene,
Northern blots were performed to determine the size of the transcript
corresponding to each gene, and to determine which putative exons were
transcribed together to make an individual gene. For Northern blot analysis
of each gene, probes were prepared from direct selected cDNA clones or by
PCR amplifying specific fragments from genomic DNA, cDNA, or from the
BAC encoding the putative gene of interest. The Northern blots gave
information on the size of the transcript and the tissues in which it was
expressed. For transcripts that were not highly expressed, it was
sometimes necessary to perform a reverse transcription PCR assay using
RNA from the tissues of interest as a template for the reaction.

Gene identification by computational methods and by direct cDNA
selection provides unique information about the genes in a region of a
chromosome. When genes are identified, then it is possible to examine
different individuals for mutations in each gene. Variants in gene sequences
between individuals can be inherited allelic differences or can arise from
mutations in the individuals. Gene sequence variants are clinically important
in that they can affect drug action on such genes. Most drugs elicit a safe
response in only a fraction of individuals, and drugs are commonly
administered to patients with no certainty that they will be safe and effective.
Many important drugs are effective in only 30-40% of patients for whom the
drug is prescribed, and virtually all drugs cause adverse events in some
individuals. Identification of mutations in disorder genes in different
individuals will enable a correlation between the safety and efficacy of drug
therapies used to treat lung diseases and the genotypes of the treated
individuals. This correlation enables health care providers to prescribe a
drug regimen that is most appropriate for the individual patient rather than
trying different drug regimens in turn until a successful drug is identified.
Identification of variants in disorder genes will also have a benefit during the
development of new drugs for the treatment of lung diseases, as the ability
to correlate genetic variation with the efficacy of new candidate drugs will
enhance lead optimization and increase the efficiency and success rate of
new drug approvals.

Gene identification by computational methods and by direct cDNA
selection provides unique information about the genes in a region of a
chromosome. Once genes are identified, it is possible to examine subjects
for sequence variants. Variant sequences can be inherited as allelic
differences or can arise from spontaneous mutations. Inherited alleles can
be analyzed for linkage to a disease susceptibility locus. Linkage analysis is
possible because of the nature of inheritance of chromosomes from parents
to offspring. During meiosis, the two parental homologs pair to guide their
proper separation to daughter cells. While they are paired, the two
homologs exchange pieces of the chromosomes, in an event called
"crossing over" or "recombination." The resulting chromosomes contain
parts that originate from both parental homologs. The closer together two
sequences are on the chromosome, the less likely that a recombination
event will occur between them, and the more closely linked they are.

Data obtained from the different families can be combined and
analyzed together by a computer using statistical methods described herein.
The results can then be used as evidence for linkage between the genetic
markers used and an asthma susceptibility locus. In general, a
recombination frequency of 1% is equivalent to approximately 1 map unit, a
relationship that holds up to frequencies of about 20% or 20 cM. One
centimorgan (cM) is roughly equivalent to 1,000 kb of DNA. The entire
human genome is 3,300 cM long. In order to find an unknown disease gene
within 5-10 cM of a marker locus, the whole human genome can be
searched with roughly 330 informative marker loci spaced at approximately
10 cM intervals (Botstein et al., 1980, Am. J. Hum. Genet. 32:314-331).
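
The rules of thumb in the preceding paragraph can be checked with simple arithmetic. The sketch below uses only the figures stated there (1% recombination is about 1 cM up to about 20%, 1 cM is about 1,000 kb, and the genome is 3,300 cM long); the function names are illustrative and the linear approximation is deliberately refused outside the stated range.

```python
import math

GENOME_LENGTH_CM = 3300  # genetic length of the human genome, as stated above
KB_PER_CM = 1000         # rough physical equivalent of one centimorgan


def markers_for_genome_scan(genome_cm: float, spacing_cm: float) -> int:
    """Number of evenly spaced markers needed to span the genome."""
    return math.ceil(genome_cm / spacing_cm)


def approx_map_distance_cm(recombination_fraction: float) -> float:
    """Linear approximation: 1% recombination is about 1 cM, valid up to about 20%."""
    if not 0.0 <= recombination_fraction <= 0.20:
        raise ValueError("linear approximation only holds up to about 20% recombination")
    return recombination_fraction * 100.0


def approx_physical_size_kb(distance_cm: float) -> float:
    """Rough physical size corresponding to a genetic distance."""
    return distance_cm * KB_PER_CM


n_markers = markers_for_genome_scan(GENOME_LENGTH_CM, 10)
distance_cm = approx_map_distance_cm(0.05)
genome_kb = approx_physical_size_kb(GENOME_LENGTH_CM)

print(n_markers, distance_cm, genome_kb)
```

A 10 cM scan of a 3,300 cM genome therefore needs about 330 markers, matching the figure quoted from Botstein et al.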

The reliability of linkage results is established by using a number of
statistical methods. The methods most commonly used for the detection by
linkage analysis of oligogenes involved in the etiology of a complex trait are
non-parametric or model-free methods which have been implemented into
the computer programs MAPMAKER/SIBS (L. Kruglyak and E.S. Lander,
1995, Am. J. Hum. Genet. 57:439-454) and GENEHUNTER (L. Kruglyak et
al., 1996, Am. J. Hum. Genet. 58:1347-1363). Typically, linkage analysis is
performed by typing members of families with multiple affected individuals at
a given marker locus and evaluating if the affected members (excluding
parent-offspring pairs) share alleles at the marker locus that are identical by
descent (IBD) more often than expected by chance alone.

As a result of the rapid advances in mapping the human genome over
the last few years, and concomitant improvements in computer
methodology, it has become feasible to carry out linkage analyses using
multi-point data. Multi-point analysis provides a simultaneous analysis of
linkage between the trait and several linked genetic markers, when the
recombination distance among the markers is known. A LOD score statistic
is computed at multiple locations along a chromosome to measure the
evidence that a susceptibility locus is located nearby. A LOD score is the
logarithm base 10 of the ratio of the likelihood that a susceptibility locus
exists at a given location to the likelihood that no susceptibility locus is
located there. By convention, when testing a single marker, a total LOD
score greater than +3.0 (that is, odds of linkage being 1,000 times greater
than odds of no linkage) is considered to be significant evidence for linkage.

Multi-point analysis is advantageous for two reasons. First, the
informativeness of the pedigrees is usually increased. Each pedigree has a
certain amount of potential information, dependent on the number of parents
heterozygous for the marker loci and the number of affected individuals in
the family. However, few markers are sufficiently polymorphic as to be
informative in all those individuals. If multiple markers are considered
simultaneously, then the probability of an individual being heterozygous for
at least one of the markers is greatly increased. Second, an indication of the
position of the disease gene among the markers may be determined.
This allows identification of flanking markers, and thus eventually allows
identification of a small region in which the disease gene resides.
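
The LOD score definition given above (base-10 logarithm of a likelihood ratio, with +3.0 corresponding to odds of 1,000 to 1) can be expressed directly in code. This is a minimal sketch of the definition only; it does not reproduce the multipoint likelihood calculations performed by programs such as MAPMAKER/SIBS. Summing per-family LOD scores to obtain a total relies on the standard assumption that families are independent, and the example values are invented for illustration.

```python
import math


def lod_score(likelihood_linked: float, likelihood_unlinked: float) -> float:
    """LOD = log10 of (likelihood with a locus at this position / likelihood without)."""
    return math.log10(likelihood_linked / likelihood_unlinked)


def odds_from_lod(lod: float) -> float:
    """Convert a LOD score back to odds in favour of linkage."""
    return 10.0 ** lod


def total_lod(per_family_lods) -> float:
    """Total LOD across independent families is the sum of the family LODs."""
    return sum(per_family_lods)


example_lod = lod_score(1000.0, 1.0)      # odds of 1,000:1
example_odds = odds_from_lod(3.0)         # back to 1,000
combined = total_lod([0.5, 1.25, 1.5])    # illustrative per-family values

print(example_lod, example_odds, combined)
print("significant" if combined > 3.0 else "not significant")
```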

EXAMPLES

The examples as set forth herein are meant to exemplify the various
aspects of the present invention and are not intended to limit the invention in
any way.

EXAMPLE 1: FAMILY COLLECTION

Asthma is a complex disorder that is influenced by a variety of
factors, including both genetic and environmental effects. Complex
disorders are typically caused by multiple interacting genes, some
contributing to disease development and some conferring a protective effect.
The success of linkage analyses in identifying chromosomes with significant
LOD scores is achieved in part as a result of an experimental design tailored
to the detection of susceptibility genes in complex diseases, even in the
presence of epistasis and genetic heterogeneity. Also important are
rigorous efforts in ascertaining asthmatic families that meet strict guidelines,
and collecting accurate clinical information.

Given the complex nature of the asthma phenotype, non-parametric
affected sib pair analyses were used to analyze the genetic data. This
approach does not require parameter specifications such as mode of
inheritance, disease allele frequency, penetrance of the disorder, or
phenocopy rates. Instead, it determines whether the inheritance pattern of a
chromosomal region is consistent with random segregation. If it is not,
affected siblings inherit identical copies of alleles more often than expected
by chance. Because no models for inheritance are assumed, allele-sharing
methods tend to be more robust than parametric methods when analyzing
complex disorders. They do, however, require larger sample sizes to reach
statistically significant results.
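
The idea of testing for excess allele sharing can be illustrated with the textbook "mean test" for affected sib pairs. This sketch is an assumption-laden simplification and is not the algorithm used by the analysis programs named in this document: it assumes the number of alleles shared identical by descent (0, 1, or 2) is known without ambiguity for every pair, and it uses the fact that under random segregation those counts occur with probabilities 1/4, 1/2, and 1/4, so the shared proportion has mean 0.5 and variance 1/8.

```python
import math


def mean_ibd_sharing_z(ibd_counts) -> float:
    """Z statistic for excess IBD sharing among affected sib pairs.

    ibd_counts holds, for each sib pair, the number of alleles (0, 1, or 2)
    shared identical by descent at the marker.
    """
    n = len(ibd_counts)
    if n == 0:
        raise ValueError("at least one sib pair is required")
    if any(count not in (0, 1, 2) for count in ibd_counts):
        raise ValueError("IBD counts must be 0, 1, or 2")
    mean_proportion = sum(ibd_counts) / (2 * n)
    # Null hypothesis: mean proportion 0.5, variance 1/8 per pair.
    return (mean_proportion - 0.5) / math.sqrt(0.125 / n)


# Invented data for illustration: 100 pairs at the null expectation,
# and 100 pairs showing excess sharing.
null_like = [0] * 25 + [1] * 50 + [2] * 25
excess_sharing = [0] * 10 + [1] * 50 + [2] * 40

z_null = mean_ibd_sharing_z(null_like)
z_excess = mean_ibd_sharing_z(excess_sharing)
print(round(z_null, 3), round(z_excess, 3))
```

A large positive statistic indicates that affected siblings share alleles more often than random segregation predicts, which is the signal the allele-sharing approach looks for.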

At the outset of the program, the goal was to collect 400 affected sib-
pair families for the linkage analyses. Based on a genome scan with
markers spaced ~10 cM apart, this number of families was predicted to
provide > 95% power to detect an asthma susceptibility gene that caused an
increased risk to first-degree relatives of 3-fold or greater. The assumed
relative risk of 3-fold was consistent with epidemiological studies in the
literature that suggest an increased risk ranging from 3- to 7-fold. The
relative risk was based on gender, different classifications of the asthma
phenotype (i.e., bronchial hyper-responsiveness versus physician's
diagnosis) and, in the case of offspring, whether one or both parents were
asthmatic.

The family collection efforts exceeded the initial goal of 400, and
resulted in a total of 444 affected sibling pair (ASP) families, with 342
families from the UK and 102 families from the US. The ASP families in the
US collection were Caucasian with a minimum of two affected siblings that
were identified through both private practice and community physicians as
well as through advertising. A total of 102 families were collected in Kansas,
Nebraska, and Southern California. In the UK collection, Caucasian families
with a minimum of two affected siblings were identified through physicians'
registers in a region surrounding Southampton and including the Isle of
Wight. In both the US and UK collections, additional affected and
unaffected sibs were collected whenever possible.

An additional 63 families from the United Kingdom were utilized from
an earlier collection effort with different ascertainment criteria. These
families were recruited either: 1) without reference to asthma and atopy; or
2) by having at least one family member or at least two family members
affected with asthma. The randomly ascertained samples were identified
from general practitioner registers in the Southampton area. For families
with affected members, the probands were recruited from hospital based
clinics in Southampton. Seven pedigrees extended beyond a single nuclear
family. The phenotypic and genotypic data information for 17 markers for 21
of these 63 families was obtained from the website
http://cedar.genetics.soton.ac.uk/pub/PROGRAMS/BETA/data/bet12.ped.

Families were included in the study if they met all of the following
criteria: 1) the biological mother and biological father were Caucasian and
agreed to participate in the study; 2) at least two biological siblings were
alive, each with a current physician diagnosis of asthma, and were 5 to 21
years of age; and 3) the two siblings were currently taking asthma
medications on a regular basis. This included regular, intermittent use of
inhaled or oral bronchodilators and regular use of cromolyn, theophylline, or
steroids.

Families were excluded from the study if they met any one of the
following criteria: 1) both parents were affected (i.e., with a current
diagnosis of asthma, having asthma symptoms, or on asthma medications
at the time of the study); 2) any of the siblings to be included in the study
was less than 5 years of age; 3) any asthmatic family member to be
included in the study was taking beta-blockers at the time of the study; 4)
any family member to be included in the study had congenital or acquired
pulmonary disease at birth (e.g., cystic fibrosis), a history of serious cardiac
disease (myocardial infarction), or any history of serious pulmonary disease
(e.g., emphysema); or 5) any family member to be included in the study was
pregnant.
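
The inclusion and exclusion criteria listed above amount to a rule set that could be applied to each candidate family. The sketch below is a simplified, hypothetical encoding: the `Sibling` and `Family` records, their field names, and the reduction of each criterion to a single flag are all assumptions made for this example and do not describe how the study actually screened families.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sibling:
    """Hypothetical record for one sibling to be included in the study."""
    age: int
    current_physician_diagnosis: bool
    on_regular_asthma_medication: bool


@dataclass
class Family:
    """Hypothetical summary of one candidate family."""
    parents_caucasian_and_consenting: bool
    both_parents_affected: bool
    siblings: List[Sibling]
    asthmatic_member_on_beta_blockers: bool = False
    member_with_serious_cardiopulmonary_disease: bool = False
    member_pregnant: bool = False


def qualifying_siblings(family: Family) -> List[Sibling]:
    """Siblings aged 5 to 21 with a current diagnosis who take regular medication."""
    return [
        s for s in family.siblings
        if s.current_physician_diagnosis
        and 5 <= s.age <= 21
        and s.on_regular_asthma_medication
    ]


def is_eligible(family: Family) -> bool:
    """All inclusion criteria must hold and no exclusion criterion may apply."""
    included = (
        family.parents_caucasian_and_consenting
        and len(qualifying_siblings(family)) >= 2
    )
    excluded = (
        family.both_parents_affected
        or any(s.age < 5 for s in family.siblings)
        or family.asthmatic_member_on_beta_blockers
        or family.member_with_serious_cardiopulmonary_disease
        or family.member_pregnant
    )
    return included and not excluded


example = Family(
    parents_caucasian_and_consenting=True,
    both_parents_affected=False,
    siblings=[Sibling(12, True, True), Sibling(9, True, True)],
)
print(is_eligible(example))
```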

An extensive clinical instrument was designed and data from all
participating family members were collected. The case report form (CRF)
included questions on demographics, medical history including medications,
a health survey on the incidence and frequency of asthma, wheeze,
eczema, hay fever, nasal problems, smoking, and questions on home
environment. Data from a video questionnaire designed to show various
examples of wheeze and asthmatic attacks were also included in the CRF.
Clinical data, including skin prick tests to 8 common allergens, total and
specific IgE levels, and bronchial hyper-responsiveness following a
methacholine challenge, were also collected from all participating family
members. All data were entered into a SAS dataset by IMTCl, a CRO,
either by double data entry or scanning followed by on-screen visual
validation. An extensive automated review of the data was performed on a
routine basis and a full audit at the conclusion of the data entry was
completed to verify the accuracy of the dataset.

EXAMPLE 2: GENOME SCAN

In order to identify chromosomal regions linked to asthma, the
inheritance pattern of alleles from genetic markers spanning the genome
was assessed on the collected family resources. As described above,
combining these results with the segregation of the asthma phenotype in
these families allows the identification of genetic markers that are tightly
linked to asthma. In turn, this provides an indication of the location of genes
predisposing affected individuals to asthma. The genotyping strategy was
twofold: 1) to conduct a genome wide scan using markers spaced at
approximately 10 cM intervals; and 2) to target ten chromosomal regions for
high density genetic mapping. The initial candidate regions for high-density
mapping were chosen based on suggestions of linkage to these regions by
other investigators.

Genotypes of PCR amplified simple sequence microsatellite genetic
linkage markers were determined using ABI model 377 Automated
Sequencers (PE Applied Biosystems). Microsatellite markers were obtained
from Research Genetics Inc. (Huntsville, AL) in the fluorescent dye-
conjugated form (see Dubovsky et al., 1995, Hum. Mol. Genet. 4(3):449-
452). The markers comprised a variation of a human linkage mapping panel
as released from the Cooperative Human Linkage Center (CHLC), also
known as the Weber lab screening set version 8. The variation of the
Weber 8 screening set consisted of 529 markers with an average spacing of
6.9 cM (autosomes only) and 7.0 cM (all chromosomes). Eighty-nine
percent of the markers consisted of either tri- or tetra-nucleotide
microsatellites. There were no gaps present in chromosomal coverage
greater than 17.5 cM.

Study subject genomic DNA (5 µl; 4.5 ng/µl) was amplified in a 10 µl
PCR reaction using AmpliTaq Gold DNA polymerase (0.225 U); 1X PCR
buffer (80 mM (NH4)2SO4; 30 mM Tris-HCl (pH 8.8); 0.5% Tween-20); 200
µM each dATP, dCTP, dGTP and dTTP; 1.5-3.5 µM MgCl2; and 250 µM
forward and reverse PCR primers. PCR reactions were set up in 192 well
plates (Costar) using a Tecan Genesis 150 robotic workstation equipped
with a refrigerated deck. PCR reactions were overlaid with 20 µl mineral oil,
and thermocycled on an MJ Research Tetrad DNA Engine equipped with
four 192 well heads using the following conditions: 92°C for 3 min; 6 cycles
of 92°C for 30 sec, 56°C for 1 min, 72°C for 45 sec; followed by 20 cycles of
92°C for 30 sec, 55°C for 1 min, 72°C for 45 sec; and a 6 min incubation at
72°C.

PCR products of 8-12 microsatellite markers were subsequently
pooled into two 96-well microtitre plates (2.0 µl PCR product from TET and
FAM labeled markers, 3.0 µl HEX labeled markers) using a Tecan Genesis
200 robotic workstation and brought to a final volume of 25 µl with H2O.
Following this, 1.9 µl of pooled PCR product was transferred to a loading
plate and combined with 3.0 µl loading buffer (2.5 µl formamide/blue dextran
(9.0 mg/ml), 0.5 µl GS-500 TAMRA labeled size standard, ABI). Samples
were denatured in the loading plate for 4 min at 95°C, placed on ice for 2
min, and electrophoresed on a 5% denaturing polyacrylamide gel (FMC on
the ABI 377XL). Samples (0.8 µl) were loaded onto the gel using an 8
channel Hamilton Syringe pipettor.
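
As a worked check of the thermocycling program given earlier in this example, the hold times can be tallied with a small script. The representation of the program as a list of (cycles, steps) pairs is an assumption made for illustration, the final hold temperature is taken to be 72°C, and the total excludes the time the instrument spends ramping between temperatures.

```python
# Each stage is (number of cycles, [(temperature in deg C, hold time in seconds), ...]).
PROGRAM = [
    (1, [(92, 180)]),                       # initial 3 min at 92
    (6, [(92, 30), (56, 60), (72, 45)]),    # 6 cycles annealing at 56
    (20, [(92, 30), (55, 60), (72, 45)]),   # 20 cycles annealing at 55
    (1, [(72, 360)]),                       # final 6 min incubation
]


def total_hold_seconds(program) -> int:
    """Sum of all programmed hold times, ignoring ramp time between steps."""
    return sum(cycles * sum(seconds for _, seconds in steps) for cycles, steps in program)


def total_cycles(program) -> int:
    """Number of amplification cycles (stages with more than one step)."""
    return sum(cycles for cycles, steps in program if len(steps) > 1)


hold_seconds = total_hold_seconds(PROGRAM)
print(total_cycles(PROGRAM), hold_seconds, hold_seconds / 60)
```

The program therefore runs 26 amplification cycles with a little over an hour of programmed hold time.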

Each gel consisted of 62 study subjects and 2 control subjects
(CEPH parents ID #1331-01 and 1331-02, Coriell Cell Repository, Camden,
NJ). Genotyping gels were scored in duplicate by investigators blind to
patient identity and affection status using GENOTYPER analysis software
V 1.1.12 (ABI; PE Applied Biosystems). Nuclear families were loaded onto the
gel with the parents flanking the siblings to facilitate error detection. The
final tables obtained from the GENOTYPER output for each gel analysed
were imported into a SYBASE Database.

Allele calling (binning) was performed using the SYBASE version of
the ABAS software (Ghosh et al., 1997, Genome Research 7:165-178).
Offsize bins were checked manually and incorrect calls were corrected or
blanked. The binned alleles were then imported into the program MENDEL
(Lange et al., 1988, Genetic Epidemiology 5:471) for inheritance checking
using the USERM13 subroutine (Boehnke et al., 1991, Am. J. Hum. Genet.
48:22-25). Non-inheritance was investigated by examining the genotyping
traces and, once all discrepancies were resolved, the subroutine USERM13
was used to estimate allele frequencies.
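
The inheritance check described above can be illustrated with a minimal sketch. The Python below is not the MENDEL program or its USERM13 subroutine; it is a hypothetical stand-in which assumes that each genotype is an unordered pair of binned allele sizes and which tests only whether a child could have received one allele from each parent.

```python
def is_mendelian(father, mother, child):
    """Return True if the child's two alleles can be explained by
    one allele from the father and one allele from the mother."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def non_inheritance(trios):
    """Return the indices of father/mother/child trios that fail the check."""
    return [i for i, (father, mother, child) in enumerate(trios)
            if not is_mendelian(father, mother, child)]


# Hypothetical binned allele sizes (bp) for three trios.
trios = [
    ((120, 124), (122, 126), (120, 126)),  # consistent
    ((120, 124), (122, 126), (128, 122)),  # 128 is absent from both parents
    ((120, 120), (124, 124), (120, 120)),  # mother cannot supply a 120 allele
]
flagged = non_inheritance(trios)
```

Trios flagged in this way would correspond to the cases for which the genotyping traces were re-examined.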

EXAMPLE 3: LINKAGE ANALYSIS

Chromosomal regions harboring asthma susceptibility genes were 
identified by linkage analysis of genotyping data and three separate 
phenotypes, asthma, bronchial hyper-responsiveness, and atopic status. 

1. Asthma Phenotype: For the initial linkage analysis, the
phenotype and asthma affection status were defined by a patient who
answered the following questions in the affirmative: i) Have you ever had
asthma? ii) Do you have a current physician's diagnosis of asthma? and iii)
Are you currently taking asthma medications? Medications included inhaled
or oral bronchodilators, cromolyn, theophylline, or steroids. Multipoint
linkage analyses of allele sharing in affected individuals were performed
using the MAPMAKER/SIBS analysis program (L. Kruglyak and E.S.
Lander, 1995, Am. J. Hum. Genet. 57:439-454). The analyses were
performed using 54 polymorphic markers spanning a 162 cM region on both
arms of chromosome 12. The map location and distances between markers
were obtained from the genetic maps published by the Marshfield medical
research foundation (http://www.marshmed.org/genetics/). Ambiguous
ordering of markers in the Marshfield map was resolved using the program
MULTIMAP (T.C. Matise et al., 1994, Nature Genet. 6:384-390).
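
The notion of a LOD score for allele sharing in affected sib pairs can be sketched as follows. This is a simplified illustration, not the MAPMAKER/SIBS algorithm: it assumes that every sib pair is fully informative at a single position, so that the number of pairs sharing 0, 1, or 2 alleles identical by descent (IBD) is known, and it uses unconstrained maximum likelihood estimates of the sharing proportions. The function names and the counts are hypothetical.

```python
import math

# Expected proportions of sib pairs sharing 0, 1, or 2 alleles IBD
# when the marker is unlinked to the trait.
NULL_IBD = (0.25, 0.50, 0.25)


def sib_pair_lod(n0, n1, n2):
    """Log10 likelihood ratio of the observed IBD proportions against
    the null proportions, for fully informative affected sib pairs."""
    counts = (n0, n1, n2)
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one sib pair is required")
    lod = 0.0
    for n, p_null in zip(counts, NULL_IBD):
        if n:  # a zero count contributes nothing to the log-likelihood
            lod += n * math.log10((n / total) / p_null)
    return lod


def ibd2_proportion(n0, n1, n2):
    """Observed proportion of pairs sharing two alleles IBD."""
    return n2 / (n0 + n1 + n2)


# Hypothetical counts for 100 affected sib pairs.
lod = sib_pair_lod(20, 50, 30)
share2 = ibd2_proportion(20, 50, 30)
```

A multipoint analysis additionally combines information from neighbouring markers and from partially informative families, which this sketch does not attempt.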

Figure 1A shows the multipoint LOD score against the map location
of markers along chromosome 12. A Maximum LOD Score (MLS) of 2[...],
based on 484 nuclear families, was obtained at location 161.7 cM, 1.0 cM
distal to markers D12S97 and D12S1045. An excess sharing by descent
(Identity By Descent; IBD=2) of 0.31 was observed at the MLS. Table 1B
shows the two-point and multipoint LOD scores at each marker.

TABLE 1B: CHROMOSOME 12 LINKAGE ANALYSIS

Marker      Location (cM)   Two-point LOD   Multipoint LOD
D12S2070    ...             ...             ...
D12S366     133.3           1.2             1.7
D12S1619    134.5           0.8             1.8
D12S385     135.1           2.0             1.6
PLA2G1B     136.8           0.9             1.4
D12S395     136.8           2.1             1.5
D12S300     140.2           0.9             1.7
D12S342     144.8           1.6             2.2
D12S324     147.2           1.3             1.4
D12S2078    149.6           0.9             1.9
D12S1659    155.9           0.3             1.6
D12S97      160.7           0.9             2.7
D12S1045    160.7           3.0             2.8
D12S392     165.7           1.1             2.3
D12S357     168.8           0.8             1.1



2. Phenotypic Subgroups: Nuclear families were ascertained by
the presence of at least two affected siblings with a current physician's
diagnosis of asthma, as well as the use of asthma medication. In the initial
analysis (see above), the evidence was examined for linkage based on that
dichotomous phenotype (asthma - yes/no). To further characterize the
linkage signals, additional quantitative traits were measured in the clinical
protocol. Since quantitative trait loci (QTL) analysis tools with correction for
ascertainment were not available, the following approach was taken to refine
the linkage and association analyses:

i. Phenotypic subgroups that could be indicative of an
underlying genotypic heterogeneity were identified. Asthma subgroups were
defined according to 1) bronchial hyper-responsiveness (BHR) to
methacholine challenge; or 2) atopic status using quantitative measures like
total serum IgE and specific IgE to common allergens.

ii. Non-parametric linkage analyses were performed on
subgroups to test for the presence of a more homogeneous sub-sample. If
genetic heterogeneity was present in the sample, the amount of allele
sharing among phenotypically similar siblings was expected to increase in
the appropriate subgroup in comparison to the full sample. A narrower
region of significant increased allele sharing was also expected to result
unless the overall LOD score decreased as a consequence of having a
smaller sample size and of using an approximate partitioning of the data.

3. Results for BHR and IgE: PC20, the concentration of
methacholine resulting in a 20% drop in FEV1 (forced expiratory volume),
was polychotomized into four groups and analyses were performed on the
subsets of asthmatic children with borderline to severe BHR (PC20 ≤ 16
mg/ml) or PC20(16). As shown in the LOD plot in Figure 1B, the MLS for the
subset of 218 nuclear families with at least two PC20(16) affected sibs was
2.2 at D12S342 with an excess sharing of 0.33. The linkage results
implicated a region of chromosome 12 centromeric to the region with the
largest signal under the asthma phenotype (Figure 1A) and indicated the
presence of one or more genes with specific susceptibility toward BHR.
Since the BHR sample represented a subset of the sample of asthmatics,
it elucidated the presence of multiple peaks in the LOD plot of Figure 1A.

Total IgE was dichotomized using an age specific cutoff for elevated
levels (one standard deviation above the mean: 52 kU/L for age 5-9; 63
kU/L for age 10-14; 75 kU/L for age 15-18; and 81 kU/L for adults).
Similarly, a dichotomous variable was created using specific IgE to common
allergens. An individual was assigned a high specific IgE value if his/her
test was positive (grass or tree) or elevated (> 0.35 kU/L for cat, dog, mite
A, mite B, alternaria, or ragweed) for at least one such measure.
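
The dichotomization rules above can be written out as a short sketch. The cutoff values are those quoted in the text; the function names, the use of whole years for age, the treatment of anyone older than 18 as an adult, and the use of a strict "greater than" comparison for total IgE are assumptions made only for illustration.

```python
def total_ige_cutoff(age_years):
    """Age specific cutoff for elevated total IgE, in kU/L."""
    if age_years < 5:
        raise ValueError("no cutoff is given for ages below 5")
    if age_years <= 9:
        return 52.0
    if age_years <= 14:
        return 63.0
    if age_years <= 18:
        return 75.0
    return 81.0  # adults


def has_elevated_total_ige(age_years, total_ige):
    """True when total IgE (kU/L) exceeds the cutoff for the age group."""
    return total_ige > total_ige_cutoff(age_years)


def has_high_specific_ige(positive_tests, specific_ige_levels):
    """True when any grass/tree test is positive or any allergen-specific
    IgE level (kU/L) is above 0.35."""
    return (any(positive_tests.values())
            or any(level > 0.35 for level in specific_ige_levels.values()))


# Hypothetical individual: 12 years old, total IgE of 70 kU/L.
elevated = has_elevated_total_ige(12, 70.0)
high_specific = has_high_specific_ige(
    {"grass": False, "tree": False},
    {"cat": 0.40, "dog": 0.10, "ragweed": 0.20},
)
```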

In linkage analyses, the subset of asthmatic children with high total
IgE (2[...] families) gave a maximum LOD score of 2.3 at D12S16[...]
(Figure 1C), with an excess sharing of 0.33. The subset with high specific
IgE, 288 families, gave a LOD score of 2.2 at 164.2 cM, 1.6 cM proximal to
marker D12S392, with an excess sharing of 0.33 (Figure 1D). The analysis
with the subset of asthmatic sibs with elevated total IgE implicated a region
similar to the one identified with the BHR subset. The region implicated by
the subset of asthmatics with elevated specific IgE coincided with the
location of the largest signal in the original asthma sample.

Accordingly, a pattern of evidence by linkage analysis pointed to the
existence of several asthma susceptibility loci in the 12q23-qter region of
chromosome 12. This was supported by the initial analysis of the asthma
(yes/no) phenotype with further localization by analyses of BHR, total IgE,
and specific IgE in asthmatic individuals. Thus, chromosome 12q23-qter
encompassed genes involved in asthma and related diseases thereof.

EXAMPLE 4: PHYSICAL MAPPING

The linkage results for chromosome 12 described above were used
to delineate a candidate region for disorder-associated gene(s) located on
chromosome 12. Gene discovery efforts were initiated in a ~43 cM interval
from marker D12S2070 to the 12q telomere, representing a 99% confidence
interval. All genes known to map to this interval were considered
candidates. Figures 2A-2P show genes mapped against the GB4 panel and
Figures 3A-3G show genes mapped against the Stanford G3 panel. The
figures were obtained directly from the GeneMap99 web site.

Physical mapping (BAC contig construction) focused on a ~22 cM
interval approximately between markers D12S307 and D12S2341. The
discovery of novel genes using direct cDNA selection focused on a ~15 cM
region between markers D12S1609 and D12S357. Figure 4 shows the
integration of the Marshfield Center for Medical Genetics
(http://www.marshmed.org/genetics/) genetic map with GeneMap99 from
NCBI. The relevant regions are indicated at the top of the figure.

The following section describes the construction of a BAC contig
spanning the disorder gene region on chromosome 12. This approach was
used: 1) to provide genomic clones for DNA sequencing (analysis of this
sequence would provide information about the gene content);
and 2) to provide reagents for direct cDNA selection (and provide additional
information about novel genes mapping to the interval). The physical map
consisted of an ordered set of molecular landmarks, and a set of BACs (U.-
J. Kim et al., 1996, Genomics 34:213-218; H. Shizuya et al., 1992, Proc.
Natl. Acad. Sci. USA 89:8794-8797) that contained the disorder gene region
from human chromosome 12q23-qter.

Figures 5A-5I show the BAC/STS content contig map of human
chromosome 12q23-qter. Markers used to screen the RPCI-11 BAC library
(P. deJong, Roswell Park Cancer Institute (RPCI)) are shown in the top row.
Markers that were present in the Genome Database (GDB,
http://gdbwww.gdb.org/) are represented by GDB nomenclature. The BAC
clones are shown below the markers as horizontal lines.

1. Map Integration: Various publicly available mapping resources
were utilized to identify existing STS (sequence tagged site) markers in the
12q23-qter region (Olson et al., 1989, Science 245:1434-1435). Resources
included GDB (http://gdbwww.gdb.org/), Genethon
(http://www.genethon.fr/genethon_en.html), the Marshfield Center for
Medical Genetics (http://www.marshmed.org/genetics/), the Whitehead
Institute Genome Center (http://www-genome.wi.mit.edu/), GeneMap98,
dbSTS, and dbEST (NCBI, http://www.ncbi.nlm.nih.gov/), the Sanger Centre
(http://www.sanger.ac.uk/), and the Stanford Human Genome Center
(http://www-shgc.stanford.edu/). Maps were integrated manually to identify
markers mapping to the disorder region. A list of markers is shown in Table 2.

2. Marker Development: Sequences for existing STSs were
obtained from the GDB, RHDB (http://www.ebi.ac.uk/RHdb/), or NCBI, and
were used to pick primer pairs (overgos; see Table 2) for BAC library
screening. Novel markers were developed from publicly available genomic
sequences, proprietary cDNA sequences, or from sequences derived from
BAC insert ends (described below). Primers were chosen using a script that
automatically performs vector and repetitive sequence masking using
CROSSMATCH (P. Green, University of Washington). Subsequent primer
selection was performed using a customized Filemaker Pro database
(http://www.filemaker.com). Primers for use in PCR-based clone
confirmation or radiation hybrid mapping (described below) were chosen
using the program Primer3 (Steve Rozen, Helen J. Skaletsky, 1996, 1997,
http://www-genome.wi.mit.edu/genome_software/other/primer3.html).

3. Radiation Hybrid (RH) Mapping: Radiation hybrid mapping
was performed against the Genebridge4 panel (Gyapay et al., 1996, Hum.
Mol. Genet. 5:339-346) purchased from Research Genetics. Mapping was
performed in order to refine the chromosomal localization of genetic markers
used in genotyping as well as to identify, confirm, and refine localizations of
markers from proprietary sequences. Standard PCR procedures were used
for typing the RH panel with markers of interest.

Briefly, 10 µl PCR reactions contained 25 ng DNA of each of the 93
Genebridge4 RH samples. PCR products were electrophoresed on 2%
agarose gels (Sigma) containing 0.5 µg/ml ethidium bromide in 1 X TBE at
150 volts for 45 min. Model A3-1 electrophoresis systems were used (Owl
Scientific Products, Portsmouth, NH). Typically, gels contained 10 tiers of
lanes with 50 wells/tier. Molecular weight markers (100 bp ladder,
GibcoBRL, Rockville, MD) were loaded at both ends of the gel.

Images of the gels were captured with a Kodak DC40 CCD camera
and processed with Kodak 1D software (www.kodak.com). The gel data
were exported as tab delimited text files. The names of the files included
information about the panel screened, the gel image files, and the marker
screened. These data were automatically imported using a customized Perl
script into Filemaker databases for data storage and analysis. The data
were then automatically formatted and submitted to an internal server for
linkage analysis to create a radiation hybrid map using RHMAPPER (L.
Stein et al., 1995; available from Whitehead Institute/MIT Center for
Genome Research, at http://www.genome.wi.mit.edu/ftp/pub/software/
rhmapper/, and via anonymous ftp to ftp.genome.wi.mit.edu, in the directory
/pub/software/rhmapper).

4. BAC Library Screening: The protocol used for BAC library
screening was based on the "overgo" method, originally developed by John
McPherson at Washington University in St. Louis
(http://www.tree.caltech.edu/protocols/overgo.html, and W-W. Cai et al.,
1998, Genomics 54:387-397). This method involved filling in the overhangs
generated after annealing two primers. Each primer was 22 nucleotides in
length, and overlapped by 8 nucleotides. The resulting labeled product (36
bp) was then used in hybridization-based screening of high density grids
derived from the RPCI-11 BAC library (deJong, supra). Typically, 15 probes
were pooled together to hybridize 12 filters (13.5 genome equivalents).
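
The geometry of an overgo probe follows directly from the figures given above: two 22-nucleotide primers that overlap by 8 nucleotides span 22 + 22 - 8 = 36 bp once the overhangs are filled in. The following sketch derives the two primers from a 36-bp target; the function names and the example sequence are hypothetical, and no attempt is made to reproduce the primer selection script used in the work described.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_overgo(target, oligo_len=22, overlap=8):
    """Split a target into two oligos whose 3' ends anneal over `overlap`
    bases; filling in the overhangs regenerates the full-length target."""
    probe_len = 2 * oligo_len - overlap
    target = target.upper()
    if len(target) != probe_len:
        raise ValueError(f"target must be {probe_len} bp long")
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, and T")
    forward = target[:oligo_len]
    reverse = reverse_complement(target[probe_len - oligo_len:])
    return forward, reverse


# Hypothetical 36-bp target sequence.
target = "ACGTTGCAAGGCTTAACCGGATCGATTACGCTAGCA"
forward, reverse = design_overgo(target)
```

Here the last 8 bases of `forward` pair with the last 8 bases of `reverse`, leaving 14-nucleotide overhangs on each side to be filled in during labeling.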

Stock solutions (2 µM) of combined complementary oligos were
heated at 80°C for 5 min, placed at 37°C for 10 min, and then stored on ice.
Labeling reactions included the following: 1.0 µl H2O; 5 µl mixed oligos (2
µM each); 0.5 µl BSA (2 mg/ml); 2 µl OLB (-A, -C, -N6) Solution (see
below); 0.5 µl 32P-dATP (3000 Ci/mmol); 0.5 µl 32P-dCTP (3000 Ci/mmol);
and 0.5 µl Klenow fragment (5 U/µl). The reaction was incubated at RT for 1
hr and unincorporated nucleotides were removed using Sephadex G50 spin
columns. Solution O: 1.25 M Tris-HCl, pH 8, and 125 mM MgCl2; Solution A:
1 ml Solution O, 18 µl 2-mercaptoethanol, 5 µl 0.1 M dTTP, and 5 µl 0.1 M
dGTP; Solution B: 2 M HEPES-NaOH, pH 6.6; Solution C: 3 mM Tris-HCl,
pH 7.4, and 0.2 mM EDTA; Solutions A, B, and C were combined to a final
ratio of 1:2.5:1.5, and aliquots were stored at -20°C.

High-density BAC library membranes were pre-wetted in 2 X SSC at
58°C. Filters were then drained slightly and placed in hybridization solution
(1% BSA; 1 mM EDTA, pH 8.0; 7% SDS; and 0.5 M sodium phosphate),
pre-warmed to 58°C, and incubated at 58°C for 2-4 hr. Typically, 6 filters
were hybridized in each container. Ten milliliters of pre-hybridization
solution was removed, combined with the denatured overgo probes, and
added back to the filters. Hybridization was performed overnight at 58°C.
The hybridization solution was removed and filters were washed once in 2 X
SSC, 0.1% SDS, followed by a 30 min wash in the same solution at 58°C.
Filters were then washed in: 1) 1.5 X SSC and 0.1% SDS at 58°C for 30
min; 2) 0.5 X SSC and 0.1% SDS at 58°C for 30 min; and in 3) 0.1 X SSC
and 0.1% SDS at 58°C for 30 min. Filters were then wrapped in Saran
Wrap®, and exposed to film overnight. To remove bound probe, filters were
treated in 0.1 X SSC and 0.1% SDS pre-warmed to 95°C, and then cooled
to RT. Clone addresses were determined in accordance with instructions
supplied by RPCI.

To recover clonal BAC cultures from the library, a sample from the
appropriate library well was plated by streaking onto LB agar (T. Maniatis et
al., 1982, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY) containing 12.5 µg/ml chloramphenicol
(Sigma), and plates were incubated overnight. A single colony and a portion
of the initial streak quadrant were inoculated into each well of a 96-well
plate containing 400 µl LB plus chloramphenicol. Cultures were grown
overnight at 37°C. For storage, 100 µl of 80% glycerol was added to each
well, and the plates were placed at -80°C.

To determine the marker content of clones, aliquots of the 96-well
plate cultures were transferred to the surface of nylon filters (GeneScreen
Plus, NEN) placed on LB/chloramphenicol petri plates. Colonies were
grown overnight at 37°C and colony lysis was performed by placing filters on
pools of: 1) 10% SDS for 3 min; 2) 0.5 N NaOH and 1.5 M NaCl for 5 min;
and 3) 0.5 M Tris-HCl, pH 7.5, and 1 M NaCl for 5 min. Filters were then
air-dried and washed free of debris in 2 X SSC for 1 hr. The filters were
air-dried for at least 1 hr, and DNA was crosslinked to the membrane
using standard conditions. Probe hybridization and filter washing were
performed as described above for the primary library screening. Confirmed
clones were stored in LB containing 15% glycerol.

In certain cases, polymerase chain reaction (PCR) was used to
confirm the marker content of clones. PCR conditions for each primer pair
were optimized with respect to MgCl2 concentration. The standard buffer
contained 10 mM Tris-HCl (pH 8.3), 50 mM KCl, MgCl2, 0.2 mM each dNTP,
0.2 µM each primer, 2.7 ng/µl human DNA, 0.25 U AmpliTaq (Perkin Elmer),
and MgCl2 concentrations of 1.0 mM, 1.5 mM, 2.0 mM, or 2.4 mM. Cycling
conditions included an initial denaturation at 94°C for 2 min; 40 cycles at
94°C for 15 sec, 55°C for 25 sec, and 72°C for 25 sec; and a final extension
at 72°C for 3 min. Depending on the results, the conditions were further
optimized as required. For further optimization, the annealing temperature
was increased to 58°C or 60°C, the cycle number was increased to 42, the
annealing and extension times were increased to 30 sec, and/or
AmpliTaqGold was used (Perkin Elmer).

5. BAC DNA Preparation: Several different types of DNA
preparation methods were used to isolate BAC DNA. The manual alkaline
lysis miniprep protocol listed below (Maniatis et al., 1982) was successfully
used for most applications, i.e., restriction mapping, CHEF gel analysis, and
FISH mapping, but this protocol was not reproducibly successful for
endsequencing. The Autogen protocol described below was used to isolate
BAC DNA for endsequencing.

For manual alkaline lysis BAC minipreps, bacteria were grown in 15
ml terrific broth (TB) containing 12.5 µg/ml chloramphenicol. Cultures were
placed in a 50 ml conical tube at 37°C for 20 hr with shaking at 300 rpm.
Cultures were centrifuged in a Sorvall RT 6000 D at 3000 rpm (1800 x g) at
4°C for 15 min. The supernatant was aspirated as completely as possible.
In some cases, cell pellets were frozen at -20°C at this step for up to 2
weeks. The pellet was then vortexed to homogenize the cells and minimize
clumping. Following this, 250 µl of P1 solution (50 mM glucose, 15 mM
Tris-HCl, pH 8, 10 mM EDTA, and 100 µg/ml RNase A) was added. The
mixture was pipetted up and down to mix. The mixture was then transferred
to a 2 ml Eppendorf tube. Subsequently, 350 µl of P2 solution (0.2 N NaOH,
1% SDS) was added, mixed gently, and the mixture was incubated for 5 min
at RT. Then 350 µl of P3 solution (3 M KOAc, pH 5.5) was added and mixed
gently until a white precipitate formed. The solution was incubated on ice for
5 min, and then centrifuged at 4°C in a microfuge for 10 min.

The supernatant was transferred carefully (avoiding the white
precipitate) to a fresh 2 ml Eppendorf tube, and 0.9 ml of isopropanol was
added. The solution was mixed and left on ice for 5 min. The samples were
centrifuged for 10 min, and the supernatant was carefully removed. Pellets
were washed in 70% ethanol and air-dried for 5 min. Pellets were then
resuspended in 200 µl of TE8 (10 mM Tris-HCl, pH 8.0, 1.0 mM EDTA, pH
8.0), and RNase (Boehringer Mannheim, http://biochem.boehringer-
mannheim.com) added to 100 µg/ml. Samples were incubated at 37°C for
30 min, then precipitated by addition of NH4OAc to 0.5 M and 2 volumes of
ethanol. Samples were then centrifuged for 10 min, and the pellets were
washed with 70% ethanol. The pellets were air-dried and dissolved in 50 µl
TE8. Typical yields for this DNA prep were 3-5 µg per 15 ml bacterial
culture. Ten to 15 µl of DNA was used for EcoRI restriction analysis; 5 µl
was used for NotI digestion and clone insert sizing by CHEF gel
electrophoresis.

Autogen 740 BAC DNA preparations were made by dispensing 3 ml
of LB media containing 12.5 µg/ml of chloramphenicol into autoclaved
Autogen tubes. A single tube was used for each clone. For inoculation,
glycerol stocks were removed from -70°C storage and placed on dry ice. A
small portion of the glycerol stock was removed from the original tube with a
sterile toothpick and transferred into the Autogen tube. The toothpick was
left in the Autogen tube for at least 2 min before discarding. After
inoculation, the tubes were covered with tape to ensure that the seal was
tight. When all samples were inoculated, the tubes were transferred into an
Autogen rack holder and placed into a rotary shaker. Cultures were
incubated at 37°C for 16-17 hr at 250 rpm.

Following this, standard conditions for BAC DNA preparation, as
defined by the manufacturer, were used to program the Autogen. However,
samples were not dissolved in TE8 as part of the program. Instead, DNA
pellets were left dry. When the program was completed, the tubes were
removed from the output tray and 30 µl of sterile distilled and deionized
water was added directly to the bottom of the tube. The tubes were then
gently shaken for 2-5 sec and then covered with parafilm and incubated at
RT for 1-3 hr. DNA samples were then transferred to an Eppendorf tube and
used either directly for sequencing or stored at 4°C for later use.

6. BAC Clone Characterization: DNA samples prepared either by
manual alkaline lysis or the Autogen protocol were digested with EcoRI for
analysis of restriction fragment sizes. These data were used to compare the
extent of overlap among clones. Typically 1-2 µg DNA was used for each
reaction. Reaction mixtures included: 1 X Buffer 2 (NEB); 1 mg/ml BSA
(NEB); 50 µg/ml RNase A (Boehringer-Mannheim); and 20 U EcoRI (NEB)
in a final volume of 25 µl. Digestions were incubated at 37°C for 4-6 hr.
BAC DNA was also digested with NotI for estimation of insert size by CHEF
gel analysis (see below). Reaction conditions were identical to those for the
EcoRI digestion, except that 20 U NotI were used. Six microliters of 6 X
Ficoll loading buffer containing bromphenol blue and xylene cyanol was
added prior to electrophoresis.
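
One way in which restriction fragment sizes can be used to compare the extent of overlap among clones is to count the fragments that two clones share within a sizing tolerance. The text does not specify how the comparison was made, so the sketch below is purely illustrative; the function name, the 2% tolerance, and the fragment sizes are hypothetical.

```python
def shared_fragments(sizes_a, sizes_b, tolerance=0.02):
    """Count fragments common to two EcoRI fingerprints.

    Two fragments are treated as the same band when their sizes differ
    by no more than `tolerance` (as a fraction of the larger fragment).
    Each fragment is matched at most once.
    """
    a, b = sorted(sizes_a), sorted(sizes_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance * max(a[i], b[j]):
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


# Hypothetical fragment sizes (bp) for two BAC clones.
clone_a = [1000, 2500, 4000, 7000]
clone_b = [1010, 2600, 4050, 9000]
n_shared = shared_fragments(clone_a, clone_b)
```

A larger number of shared bands relative to the total would suggest a greater degree of overlap between the two inserts.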

EcoRI digests were analyzed on 0.6% agarose gels (Seakem, FMC
Bioproducts, Rockland, ME) in 1 X TBE containing 0.5 µg/ml ethidium
bromide. Gels (20 cm x 25 cm) were electrophoresed in a Model A4
electrophoresis unit (Owl Scientific) at 50 volts for 20-24 hr. Molecular
weight size markers included undigested lambda DNA, HindIII digested
lambda DNA, and HaeIII digested ΦX174 DNA. Molecular weight markers
were heated at 65°C for 2 min prior to loading the gel. Images were
captured with a Kodak DC40 CCD camera and analyzed with Kodak 1D
software.

NotI digests were analyzed on a CHEF DRII (Bio-Rad)
electrophoresis unit according to the manufacturer's recommendations.
Briefly, 1% agarose gels (Bio-Rad pulsed field grade) were prepared in [...]
X TBE, equilibrated for 30 min in the electrophoresis unit at 14°C, and
electrophoresed at 6 volts/cm for 14 hr with circulation. Switching times
were ramped from 10 sec to [...] sec. Gels were stained after electrophoresis
in 0.5 µg/ml ethidium bromide. Molecular weight markers included
undigested lambda DNA, HindIII digested lambda DNA, lambda ladder PFG
ladder, and low range PFG marker (all from NEB).

7. BAC Endsequencing: The sequencing of BAC insert ends
utilized DNA prepared by either of the two methods described above. The
ends of BAC clones were sequenced for the purpose of filling gaps in the
physical map and for gene discovery information. The following vector
primers specific to the BAC vector pBACe3.6 were used to generate
endsequence from BAC clones: pBAC 5'-2 (TGT AGG ACT ATA TTG CTC;
SEQ ID NO: ) and pBAC 3'-1 (CGA CAT TTA GGT GAC ACT; SEQ ID NO: ).

The ABI dye-terminator sequencing protocol was used to set up
sequencing reactions for 96 clones. The BigDye (ABI; PE Applied
Biosystems) Terminator Ready Reaction Mix with AmpliTaq® FS, Part
number 4303151, was used for sequencing with fluorescently labeled
dideoxy nucleotides. A master sequencing mix was prepared for each
primer reaction set, and included: 1600 µl of BigDye terminator mix (ABI;
PE Applied Biosystems); 800 µl of 5 X CSA buffer (ABI; PE Applied
Biosystems); and 800 µl of primer (either pBAC 5'-2 or pBAC 3'-1 at 3.2
µM). The sequencing cocktail was vortexed to ensure it was well-mixed and
32 µl was aliquoted into each PCR tube. Eight microliters of the Autogen
DNA for each clone was transferred from the DNA source plate to a
corresponding well of the PCR plate. The PCR plates were sealed tightly
and centrifuged briefly to collect all the reagents. Cycling conditions were as
follows: 1) 95°C for 5 min; 2) 95°C for 30 sec; 3) 50°C for 20 sec; 4) 65°C
for 4 min; 5) steps 2 through 4 were repeated 74 times; and 6) samples
were stored at 4°C.
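
The volumes quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are chosen for illustration, and the final primer concentration is derived on the assumption that the 8 µl of template contributes no primer.

```python
# Master mix components and volumes (µl), as listed above.
MASTER_MIX_UL = {
    "BigDye terminator mix": 1600,
    "5 X CSA buffer": 800,
    "primer (3.2 µM)": 800,
}
ALIQUOT_UL = 32      # master mix dispensed per PCR tube
TEMPLATE_UL = 8      # Autogen DNA added per reaction
PRIMER_STOCK_UM = 3.2

total_ul = sum(MASTER_MIX_UL.values())
n_aliquots = total_ul // ALIQUOT_UL
reaction_ul = ALIQUOT_UL + TEMPLATE_UL

# Volume of each component that ends up in a single reaction.
per_reaction_ul = {name: volume * ALIQUOT_UL / total_ul
                   for name, volume in MASTER_MIX_UL.items()}

# Final primer concentration in the assembled reaction (µM).
final_primer_um = (PRIMER_STOCK_UM * per_reaction_ul["primer (3.2 µM)"]
                   / reaction_ul)
```

On these figures the 3200 µl of master mix yields 100 aliquots of 32 µl, which covers a 96-clone plate with four aliquots to spare, and each 40 µl reaction contains 16 µl of BigDye mix, 8 µl of buffer, and 8 µl of primer.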

At the end of the sequencing reaction, the plates were removed from
the thermocycler and centrifuged briefly. Centri-Sep 96 plates were then
used according to manufacturer's recommendations to remove
unincorporated nucleotides, salts, and excess primers. Each sample was
[...] onto ABI 377 Fluorescent Sequencers. The resulting end sequences were
[...] gaps. The end sequences were also analyzed by BLASTN to identify EST
or gene content. The BAC end sequences correspond to SEQ ID NO:156 to
SEQ ID NO:693, disclosed herein.

EXAMPLE 5: SUBCLONING AND SEQUENCING OF BACs
FROM 12q23-qter

The physical map of the chromosome 12 region provided a set of
BAC clones for use as sequencing templates (see Figures 5A-5I). BAC
DNA was isolated by a QIAGEN purification (QIAGEN, Inc.,
Valencia, CA, per manufacturer's instructions) or a manual purification. The
manual purification method was a modification of the standard alkaline
lysis/cesium chloride preparation for plasmid DNA (see e.g., F.M. Ausubel et
al., 1997, Current Protocols in Molecular Biology, John Wiley & Sons, New
York, NY).

Briefly, for manual purification, cells were pelleted, and resuspended
in GTE (50 mM glucose, 25 mM Tris-Cl (pH 8), and 10 mM EDTA) and
lysozyme (50 mg/ml solution). This was followed by addition of NaOH/SDS
(1% SDS and 0.2 N NaOH) and then an ice-cold solution of 3 M KOAc (pH
4.5-4.8). RNase A was added to the filtered supernatant, followed by
Proteinase K and 20% SDS. The DNA was precipitated with isopropanol,
and then dried, and resuspended in TE (10 mM Tris, 1 mM EDTA (pH 8.0)).
The BAC DNA was further purified by cesium chloride density-gradient
centrifugation (Ausubel et al., 1997). Following isolation, the BAC DNA was
hydrodynamically sheared using HPLC (Hengen et al., 1997, Trends in
Biochem. Sci. 22:273-274) to an insert size of 2000-3000 bp. After
shearing, the DNA was concentrated and separated on a standard 1%
agarose gel. A single fraction, corresponding to the approximate size, was
excised from the gel and purified by electroelution (Sambrook et al., 1989).

The purified DNA fragments were then blunt-ended using T4 DNA
polymerase. The blunt-ended DNA was then ligated to unique BstXI-linker
adapters (5' GTCTTCACCACGGGG (SEQ ID NO: ) and 5'
GTGGTGAAGAC (SEQ ID NO: ) in 100-1000 fold molar excess). These
adapters were complementary to the BstXI-cut pMPX vector, whereas the
BstXI-cut vector was not self-complementary. Therefore, the adapters would
not concatemerize, and the cut vector would not ligate to itself. The linker-
adapted inserts were separated from unincorporated linkers on a 1%
agarose gel and purified using GeneClean (BIO 101, Inc., Vista, CA). The
linker-adapted insert was then ligated to a modified pBlueScript vector to
construct a "shotgun" subclone library. The vector contained an out-of-
frame lacZ gene at the cloning site, which became in-frame in the event that
an adapter-dimer was cloned. Such adapter-dimer clones gave rise to blue
colonies, which were avoided.

Sequencing was performed using ABI377 automated DNA
sequencing methods. Major modifications to the protocols are highlighted
as follows. Briefly, the library was transformed into DH5α-competent cells
(GibcoBRL, DH5α-transformation protocol). Transformed cells were plated
onto antibiotic plates containing ampicillin and IPTG/X-gal. The plates were
incubated overnight at 37°C. White colonies were identified, and plated to
obtain individual clones for sequencing. Cultures were grown overnight at
37°C. DNA was purified using a silica bead DNA preparation method (Ng et
al., 1996, Nucl. Acids Res. 24:5045-5047). In this manner, 25 µg of DNA
was obtained per clone.

Purified DNA samples were sequenced using ABI dye-terminator
chemistry. The ABI dye terminator sequence reads were run on ABI377
machines, and the data were directly transferred to a UNIX machine
following lane tracking of the gels. All reads were assembled using PHRAP
(P. Green, Abstracts of DOE Human Genome Program Contractor-Grantee
Workshop V, Jan. 1996, p.167) with default parameters and [...].
Each BAC was sequenced for ~3 X coverage. SEQ ID NOs for assembled
contigs are shown in Table 3A, below.

TABLE 3A: BAC SEQUENCES

Genomic Sequence    SEQ ID NO: Range
RP11-666B20         ...
RP11-702C13         766-808
RP11-723P10         809-869
RP11-831E18         870-899
RP11-899A17         900-927
RP11-932D22         928-978
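
The ~3 X coverage figure given for each BAC can be related to the number of sequence reads by the usual definition of coverage (total bases read divided by insert length). The sketch below is illustrative only: the 150,000 bp insert size and the 500 bp read length are assumed values that do not come from the text, and the expected fraction of the insert covered uses the standard Poisson (Lander-Waterman) approximation.

```python
import math


def reads_for_coverage(insert_bp, read_bp, coverage):
    """Number of reads needed so that reads * read_bp / insert_bp
    reaches the requested coverage."""
    return math.ceil(coverage * insert_bp / read_bp)


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read, assuming
    reads start at uniformly random positions."""
    return 1.0 - math.exp(-coverage)


# Assumed values for illustration: a 150 kb BAC insert and 500 bp reads.
n_reads = reads_for_coverage(insert_bp=150_000, read_bp=500, coverage=3)
fraction = expected_fraction_covered(3)
```

Under these assumptions about 900 reads would be needed, and roughly 95% of the insert would be expected to be represented in at least one read, which is consistent with each BAC assembling into multiple contigs rather than a single finished sequence.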



Additional BAC sequences (GenBank (www.ncbi.nlm.nih.gov)) were
also investigated as potentially containing gene or gene(s) involved in
asthma and related diseases thereof.

TABLE 3B: BAC SEQUENCES

Genomic Sequence    SEQ ID NO:
AC003982            ...
AC011216            ...
AC023437            696
AC024021            697
AC024642            698
AC025641            699
AC025837            700
AC026331            701
AC026333            702
AC026336            703
AC026764            705
AC026869            704
AC048337            706
AC063926            707
AC069209            708
AC073527            709
AC073862            710
AC073912            711
AC073930            712
AC078925            713
AC078926            714
AC079031            715
AC079602            716
AC090147            717
AC090565            718
Z98941              979
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, i^n^Zf^^ 

sequences corresponding ,o gene fragments in public databases (GenBanK 
nd buman dbEST, and proletary cDNA sequences (IMAGE — rn 
and direct selected cDNAs) were masKed for repetitive seguences and 
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clustered using the PANGEA Systems EST clustering tool (DoubleTw,st. 
Oakland. CA). The clustered sequences were then subjected to 
computational analysis to identify regions bearing similarity to known genes. 
This protocol included the following steps: 

a. The clustered sequences were compared to the publicly
available UniGene database (NCBI) using the BLASTN2 algorithm (Altschul
et al., 1997). The parameters for this search were: E = 0.05, V = 50, B = 50,
where E was the expected probability score cutoff, V was the number of
database entries returned in the reporting of the results, and B was the
number of sequence alignments returned in the reporting of the results
(Altschul et al., 1990).

b. The clustered sequences were compared to the GenBank
database (NCBI) using BLASTN2 (Altschul et al., 1997). The parameters for
this search were E = 0.05, V = 50, B = 50, where E, V, and B were defined as
above.

c. The clustered sequences were translated into protein
sequences for all six reading frames, and the protein sequences were
compared to a non-redundant protein database compiled from GenPept,
SWISSPROT, and PIR (NCBI). The parameters for this search were E =
0.05, V = 50, B = 50, where E, V, and B were defined as above.

d. The clustered sequences were compared to BAC sequences
(see below) using BLASTN2 (Altschul et al., 1997). The parameters for this
search were E = 0.05, V = 50, B = 50, where E, V, and B were defined as
above.
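
Step c above relies on translating each nucleotide sequence in all six reading frames before the protein database search. The following is a minimal sketch of that translation, assuming the standard genetic code; the function names and the example sequence are illustrative and do not come from the disclosure, and the database search itself is not shown.

```python
# Standard genetic code in the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations for frames +1..+3 (forward) and -1..-3 (reverse)."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Hypothetical nine-base example sequence.
frames = six_frame_translations("ATGGCCTAA")
```

Each of the six protein strings would then be submitted to the protein search with the E, V, and B settings given above.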



2. Gene Identification from BAC Genomic Sequence: Following
assembly of the BAC sequences into contigs, the contigs were subjected to
computational analyses to identify coding regions and regions bearing DNA
sequence similarity to known genes. This protocol included the following
steps:

a. Contigs were degapped. The contig sequences contained
symbols that represented locations where the individual ABI sequence reads
had insertions or deletions (denoted by periods). Prior to automated
computational analysis of the contigs, the periods were removed. The
original contig sequences were held for future reference.

b. BAC vector sequences were masked within the sequence by
using the program CROSSMATCH (P. Green, http://chimera.biotech.
washington.edu/UWGC). Shotgun library construction (detailed above) left
BAC vector sequences in the shotgun libraries. The CROSSMATCH
program was used to compare the sequence of the BAC contigs to the BAC
vector and to mask any vector sequence prior to subsequent steps. Masked
sequences were marked by "Xs" in the sequence files, and were omitted
during subsequent analyses.

c. E. coli sequences contaminating the BAC sequences were
masked by comparing the BAC contigs to the entire E. coli genome.

d. Repetitive elements known to be common in the human
genome were masked using CROSSMATCH (P. Green, University of
Washington). In this implementation of CROSSMATCH, the BAC sequence
was compared to a database of human repetitive elements (J. Jurka,
Genetic Information Research Institute, Palo Alto, CA). The masked repeats
were marked by "Xs" in the sequence files, and were omitted during
subsequent analyses.

e. The location of exons within the sequence was predicted using
the MZEF computer program (Zhang, 1997, Proc. Natl. Acad. Sci. 94:565-
568) and GenScan gene prediction program (Burge and Karlin, J. Mol. Biol.
268:78-94).

f. The sequence was compared to the publicly available
UniGene database (NCBI) using the BLASTN2 algorithm (Altschul et al.,
1997). The parameters for this search were: E=0.05, V=50, B=50, where E
was the expected probability score cutoff, V was the number of database
entries returned in the reporting of the results, and B was the number of
sequence alignments returned in the reporting of the results (Altschul et al.,
1990).

g. The nucleotide sequence was translated into amino acid
sequences for all six reading frames, and the amino acid sequences were
compared to a non-redundant protein database compiled from GenPept,
SWISSPROT, and PIR (NCBI). The parameters for this search were
E=0.05, V=50, B=50, where E, V, and B were defined as above.

h. The BAC DNA sequence was compared to a database of
clustered sequences using the BLASTN2 algorithm (Altschul et al., 1997).
The parameters for this search were E=0.05, V=50, B=50, where E, V, and
B were defined as above. The database of clustered sequences was
prepared utilizing a proprietary clustering technology (PANGEA Systems,
Inc.). The clustering program compiled cDNA clones derived from direct
selection experiments (described below), human dbEST sequences
mapping to the 12q23-qter region, proprietary cDNAs, GenBank genes, and
IMAGE consortium cDNA clones.

i. The BAC sequence was compared to the BAC end sequences
from the 12q23-qter region using the BLASTN2 algorithm (Altschul et al.,
1997). The parameters for this search were E=0.05, V=50, B=50, where E,
V, and B were defined as above.

j. The BAC sequence was compared to the GenBank database
(NCBI) using the BLASTN2 algorithm (Altschul et al., 1997). The
parameters for this search were E=0.05, V=50, B=50, where E, V, and B
were defined as above.

k. The BAC sequence was compared to the STS division of the
GenBank database (NCBI) using the BLASTN2 algorithm (Altschul et al.,
1997). The parameters for this search were E=0.05, V=50, B=50, where E,
V, and B were defined as above.

l. The BAC sequence was compared to the Expressed
Sequence Tag (EST) GenBank database (NCBI) using the BLASTN2
algorithm (Altschul et al., 1997). The parameters for this search were
E=0.05, V=50, B=50, where E, V, and B were defined as above.

m. The exon prediction programs MZEF (Zhang, 1997, Proc. Natl.
Acad. Sci. USA 94:565-568) and GenScan (Burge and Karlin, J. Mol. Biol.
268:78-94) were also utilized to help identify the exons.
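
The preprocessing in steps a through d (removing the period symbols, then omitting regions masked with "X") can be sketched as follows. This is only an illustration under stated assumptions: the masking itself was done with CROSSMATCH, which is not reproduced here, and the function names and the example contig are hypothetical.

```python
def degap(contig):
    """Remove the periods that mark insertions or deletions in the reads."""
    return contig.replace(".", "")


def unmasked_segments(seq, mask_char="X", min_len=1):
    """Return (start, subsequence) pairs for each run not covered by the mask.

    Start positions are 0-based offsets into the degapped sequence.
    """
    segments = []
    start = None
    for i, base in enumerate(seq):
        if base.upper() == mask_char:
            if start is not None:
                if i - start >= min_len:
                    segments.append((start, seq[start:i]))
                start = None
        elif start is None:
            start = i
    if start is not None and len(seq) - start >= min_len:
        segments.append((start, seq[start:]))
    return segments


# Hypothetical contig: periods mark read indels, X marks masked vector/repeat.
original = "ACG.TXXXXGG.A"      # the original is kept for future reference
degapped = degap(original)
segments = unmasked_segments(degapped)
```

Keeping the start offsets makes it possible to map any later search hit back to its position in the degapped contig.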

The results of BLAST searches of protein and nucleotide databases
are summarized in Table 4. Column 1 lists the gene names, and column 2
lists the types of sequences (i.e., Gene, Expressed Sequence Tag (EST), etc.).
Columns 3 and 4 list the SEQ ID NOs for the nucleotide and amino acid
sequences, respectively. Column 5 lists the GenBank accession numbers.
Column 6 lists the descriptions of the genes or ESTs relating to potential
functions. Using this information, one of ordinary skill in the art is able to
determine the roles of these genes and their relation to the disorders
described herein. The seventh column lists the genetic [...], and the
eighth column lists the corresponding BAC clones. The SEQ ID NOs
corresponding to the BAC clones are shown in Tables 3A and 3B. It
should be noted that the disclosed genes are referred to
herein using both short (e.g., 561.1, 561.2, etc.) and
long (e.g., 661.1*1, B61.M2; see Example 14) nomenclature.
cloned cDNA libraries from normal lung and bronchial epithelium were
constructed using standard methods (Soares et al., 1994, Automated DNA

Sequencing and Analysis, Adams et al. (eds), Academic Press, NY, pp. 110
ieq a , . OK1A . were extracted from tissue or cells by 

tctal/cytoplasmic RNA using dynabeads-dT accordmg to 
recommendations <DynaUnc..h«P« 

cDNA was tben ligated into tbe ^ ^^^ was 

„ (Stratagene. ^^^ s ^'Lporaticn (Scares, 
formed into E. co„ host DH10B or DH1 y ^ ^ £ 

1994 , Fd^a^Q^-^^^ for the Mega . prep kft 
eo/i colonies after scrapmg the plates 

(QIAGEN). The gua,*y of the cDNA libraries was — V - ■ 
20 portion c, the total number of prima. . ^ JJ^ 

•™r+ ci7P and calcu ating the percentage ui ^ 
average insert size, ana » ^ 

CDNA insert. AddHional cDNA libranes (human toU, bra n, 
Oocyte, and fetal brain) were purchased from Lrfe 
(Bethesda. MD). nexam er-pnmed. were 

* «o^h nf the cDNA libranes were preparea as> 
10 x 10 arrays of eatf of the , cDN transform ants. The 

cDNA libranes were Wared to 2.5 x 10 * 0 P 2 ^ of 

appropriate volume of frozen stock was used to 
30 uLampi* ,100 pg W Four hundred +£££L about 5000 

cfu (colony forming units). The tuDes w« 
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«n ,n7 0 9 was obtained. Frozan stocks were prepared 
shaking until an OD of 0.7-0.9 was o ^ rf 8Q% 

for each o, the cultures by aliguo^ and s , ored a t -70-C: 

^r.TJ^s^ -* - q,agen sp,n mini - 

DNA was isolated from the rema.n.ng from the 

prep k,t according to the — ^ ^ ^ Ma rkers were 
400 cultures were pooled to make 

de signed to amplify putative axons from Canada e genes^ ^ 
— was used to screen the 

"Z C the presence of nO N A olones 

arrayed library, r . prR „ sing the same markers. 

were confirmed by a second PGR us,ng th ^ ^ 

nMA lihrarv was dentified as HKeiy w ^ 
Once a cDNA library w critjeal ^ rt 

corresponding to a transcnpt of ' ntereS < inserts . This was 

was used to isolate a clone or clones contain g^ ^ 
acc omplished by a mod^on ^ LB ,us ampi* 

(Sambrook et a, 1989V J^'J^ libra , Co.onies were 
a9ar piates were were then transferred to ny.on 

allowed to grow overnight at 3f u. equiva | e nf) and duplicates 

— - ~ IT eLntialW as described 

prepared by press.ng ^ ^ ^ was then incubated an 
(Sa mbrook e, a,.. ^^ Jl^lLa. time to grow. The DNA 
.dditionel 6-8 hr to allow the colonies a jncubatjng 
from the becteri, colonies was then bound to the y ^ ^ ^ g ^ ^ 
tne f , ners with denaturing solurion (0 5 N . ^ ^ ^ 2 ^ ^ 
neutralization solution (0.5 M Tns-CI P . . ■ ^ g 

The bacteria, colonies were * J sue pape , The filters 

o, 2 X SSC/2% SDS for 1 mm wh,.e rubb, g B- 
^ re airbed and baked under vacuum a, 80 C for 

DNA to the filters. nrpn ared by random hexamer 

cDNA hybridization probes were ^ ^ , M For 
lab eling (Fineberg end Voge.stem. 1983. Ana/. 
small fragments, probes were prepared using gene-specific primers and
small fragments, p membranes were 

omitting random hewers - Mh. re otto ^ q 1% ^ . 

pre-wasbed in 10 mM Tns-CI pH 8.0. M N. pre . hybridize d 
for 30 min at WC. Following tbe pre-wash. .be Ti ters were P y 
in more than 2 ml/filter of 6 X SSC, 50% deionized formamide, 2% SDS, 5 X
Denhardt's solution, and 100 mg/ml denatured salmon sperm DNA, at 42°C
for 30 min. The filters were then transferred to hybridization solution (6 X
SSC, 2% SDS, 5 X Denhardt's, and 100 mg/ml denatured salmon sperm
DNA) containing denatured 32P-dCTP-labeled cDNA probe, and incubated

overnight at 42'C. under comMi 

The following morning, the filters were x 
,■ n 2 X SSC/2% SDS a, RT (room temperature, for 20 min, followed 

screening. Secondary screening wa* colon ies, so that 

the isolated clone. nirprt cPNA 

a Gene Mdentific^^^ 

^tion o, 9 enes mapping to a pa^r , tew ^ ^ 
nybr idizing genomic DNA (in this case, BACs from « 
pools of cDNAs derived from various fissue sou^ T e P 

of cDNA libraries. The tissues uwu 
cells, Th2 cells stimulated with TPA, bronchial smooth muscle cells,
unstimulated Th0 cells, Th0 cells stimulated with anti-CD3 and TPA, pulmonary
artery endothelium cells, lung microvascular endothelial cells, bronchial
epithelium cells, normal and asthmatic lung, small airway epithelium cells,
pulmonary artery smooth muscle cells, and lung fibroblasts. These cell
types have been implicated in the pathophysiology of asthma and were
expected to express genes involved in the asthmatic inflammatory response.
In addition, RNA isolated from brain cells was used, because brain cells
express a diverse array of genes.

Cytoplasmic RNA was isolated as described by Sambrook et al.,
1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY. Approximately 400-600 µg of
cytoplasmic RNA was isolated from 50 million cells. Total RNA was isolated
from normal and asthmatic lung tissue using TRIzol Reagents (GibcoBRL),
which are ready-to-use monophasic solutions of guanidinium thiocyanate
and phenol (P. Chomczynski and N. Sacchi, 1987, Anal. Biochem. 162:156-
159; P. Chomczynski et al., 1987, J. NIH Res. 6:83; D. Simms et al., 1993,
Focus 15:99; P. Chomczynski, 1993, BioTechniques 15:532). Five hundred
milligrams of frozen tissue was crushed into a fine powder using a Bessman
tissue pulverizer (Fisher Scientific). The TRIzol Reagents were mixed with
the crushed tissue according to the manufacturer's recommendations.

To ascertain whether there was genomic DNA or heteronuclear RNA
contamination, PCR and RT/PCR were performed. PCR analysis was
performed using primers (Research Genetics) that amplified STS markers
from chromosomes 2 (D2S2358), 7 (D7S2776 and D7S685), 10 (D10S228
and D10S1755), and 20 (D20S905 and D20S95). All PCR reactions were
performed in a final volume of 25 µl containing 1 µl of RNA, 10 mM Tris-HCl
(pH 8.3), 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatin, 200 mM each dNTPs,
1.0 µM of each primer, and 1 U Taq DNA polymerase (Perkin Elmer). A
Perkin Elmer 9600 cycler was used for amplification as follows: 30 sec at
94°C, 30 sec at 55°C, and 30 sec at 72°C for 30 cycles. RT/PCR analysis
was performed using the Superscript One-Step RT-PCR System (Gibco
BRL, Rockville, MD) according to the manufacturer's recommendations. All
PCR and RT/PCR products were evaluated by electrophoresis on a 1%

agarose gel. Poly(A)+ RNA was prepared from the total RNA isolated from the
human primary cells and lung tissues using Dynabeads Oligo(dT) according
to the manufacturer's recommendations (Dynal, Lake Success, NY).
Approximate* 4 pg o, messenger RNA was isolated from 1« « - «- 
for each cell type and tissue source. Total RNA isolated from brain
tissue was purchased from CLONTECH (Palo Alto, California) and poly(A)+
RNA was prepared from this material using Dynabeads as described
above. Oligo dT and random primed cDNA pools were generated from the
mRNA isolated from each ceU type and tissue source. Briefly, 2. pg mRNA 
was mixed with o,igo(dT) primer in one reaction, in 
, 5 mRNA was mixed with random hexamers. and converted to 

, -n.arv DNA using the Superscript Choice System for cDNA 
ScoTrJ RoUe. MD, according to the manufacture, 

recommendations. Paired phosphorylated cDNA linkers (Table 5) were
annealed by mixing a 1:1 ratio of the paired linkers (10 µg each), incubating
the mixture at 65°C for 5 min, and allowing the mixture to cool to RT for 30
min. The annealed linkers were ligated to the oligo(dT) and random primed
cDNA pools from various tissue and cell sources (Table 5) according to the
manufacturer's instructions (GibcoBRL). The linker sequence provided a tag
to identify the RNA from the particular cell types.
The cDNA pools were evaluated for length distribution by PCR
amplification using 1 µl of a 1:1, 1:10, and 1:100 dilution of the ligation
reaction. All PCR reactions were performed in a final volume of 25 µl
fining 1 PI o, DNA, 10 mM Tri,HC, <pH 8.3). SO mM KC, . mM 

Mg C„, 0.001 % gelat, ^^; p 2 M s^ZZ^ 
U Taq DNA polymerase (Perkin Elmer). A PerKin timer 

to for amplication as foflows: 30 seconds a, 94'C. 30 seconds a, 
To and 2 minutes at 72'C for 30 cycles. The length distribution of the 
amieu cDNA pools was evaluated by electrophoresis on a 1% agarose 
0 7 Tbe PCR reaction that gave the best representation o he , random 
primed and oligo dT primed cDNA poo.s was sca,ed-up to f"£»j£ 

t, j « i y ppr reaction for the siamny 
each cDNA pool. This represented a 1 X PCR react.o 

CDNA Twtnty BACs (Tab,e 6) that spanned the 15 cM chtical region 
\ H19^1609 and D12S357 were pooled in equ.molar 

15 ^ DNA was — 

r re-UTP .V nil translation in accordance with the 
instructions (Boehhnger-M—. The incorpora.on o ^ was 
monitored by standard methods (Del Mastro and Lovett, 1996, Methods in
Molecular Biology, Humana Press Inc., NJ).

TABLE 6: BACs SPANNING THE 15 cM REGION

0753B07    0666B20    0687F10    0820N16
0899A17    0716I10    0839D11    0894M06
0696L08    0979G13    0723P10    0932D22
0825K21    0866B05    0750I23    0831E18
0761L21    0702C13    0739N03    1064I09
Direct cDNA selection was performed using standard methods (Del
Mastro and Lovett, 1996, Methods in Molecular Biology, Humana Press Inc.,
NJ). Briefly, 1 µg of each cDNA pool was placed into individual PCR tubes.
A tota f 0 d ect selection experiments were arrayed onto a PGR plat. 
Su r sion of high copy repeats, ribosoma, RNA. and p.asm.d DNA « the 
Suppression u a nun dred nanograms of 

CDNA pools was performed to a Cot*. One nund 

— d BAC ^ was m iX ed ^J^J^ 

sr^r^-£-i- pa— ?z 

appropriate primers (shown »n Table 5). and a seco 

selection was performed. h 
CTP Binding Nuclear Protein RAN (TC4. a gene that maps w,th.n 

, 6 JZ^Z, was used to ^ 

surfing, r^^'^rn. random pnmed product o, me second 
T:lZeZ (me secondary selected materfa.) from lung 
" ndlet. Jhs ThO/uns» i mu l atedce..s..ungr 1 brob,astce. l s. 

rSSr^ a- endo— - ; = 

- — eP^um ce ^^'^ „as PC, 
wrth TPA. and oligo dT pnmed ThC ce.ls^ m ^ ^ 

amplified with modffied pnmers (Table 7. below), 
for two rounds of direct cDNA selection. 
TABLE 7: MODIFIED OLIGONUCLEOTIDES

The amplified material was cloned into the UDG vector pAMP10
(GibcoBRL) in accordance with the manufacturer's recommendations. Four
hundred and eighty clones were picked from each transformed source and

:::::::: — r r 

:re »^rjrjr.^ r= - 

containing 1 P 9 of Cot, DNA and W ^ 

— r ^Sngnrl — , Coid Spdng Harho, 
) Laboratory Manual, Cold Spring nam 1% SDS ) at 

NY) . ^^-^^^^i'Si dupiicate 
6 ,C. and were auto-,og^ ^ ^ ^ 

signals were scored as bacKgroun twentv-three 96-well 

we re re-arra y ed into « — P= _ 

seguence* Th s n^dec Mh e ^ ^ for ^ ^ ^ 

:::rr; P :r - on e r ^ rr - ™ = 

30 S^nTiTi- „ - ABi - automated fluorescence 
sequencer (Applied Biosystems). 
Clones representing other contaminants, such as high copy repeats,
ribosomal RNA, plasmid DNA, mitochondrial DNA, and E. coli and yeast
DNA, that were not identified in the hybridization process were removed from
the dataset using in silico methods. This produced a set of cDNA clones
corresponding to SEQ ID NO:980 to SEQ ID NO:1766, disclosed herein.
These clones were clustered using the PANGEA Systems EST Clustering Tool
(Oakland, CA), and analyzed with BLASTN, BLASTX, and FASTA
programs. This allowed the assembly of full-length gene sequences. The
direct selected clones were combined with the ESTs homologous to BAC
sequences, BAC end sequences, and sequences within the public domain
(dbEST and GenBank), and then clustered using the PANGEA Systems
EST Clustering Tool. The clustered sequences (i.e., consensus sequences)
correspond to SEQ ID NO:1767 to SEQ ID NO:4687, disclosed herein. In
silico and hybridization techniques were used to map the
cDNAs to the 15 cM region. Using well-established sequencing techniques,
one skilled in the art could extend these candidate clones that map back to
the region into full-length genes.

EXAMPLE 8: EXPRESSION ANALYSIS

In order to characterize the expression of genes mapping to the
12q23-qter region, a series of experiments were performed. First,
oligonucleotide primers were designed for PCR and RT-PCR reactions so that
cDNA, EST, or genomic DNA could be amplified from a pool of DNA
molecules or RNA population. The PCR primers were used in a reaction
containing genomic DNA to verify that they generated a product of the
predicted size, based on the genomic sequence. The length, in nucleotides,
of the processed transcript or messenger RNA (mRNA) was determined by
Northern analysis (Sambrook et al., 1989, Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY). Probes
were generated using one of the methods described below.

Briefly, sequence verified IMAGE consortium cDNA clones were
digested with appropriate restriction endonucleases to release the insert.
The restriction digest was electrophoresed on an agarose gel and the bands
containing the insert were excised. The gel piece containing the DNA insert
was placed in a Spin-X (Corning Costar Corporation, Cambridge, MA) or
Supelco spin column (Supelco Park, PA) and spun at high speed for 15 min.
The DNA was ethanol precipitated and resuspended in TE. Alternatively,
PCR products obtained from genomic DNA or RT-PCR were purified as
described above. Inserts purified from IMAGE clones were random primer
labeled (Feinberg and Vogelstein) to generate probes for hybridization.
Probes from purified PCR products were generated by incorporation of
α-32P-dCTP in a second round of PCR. Commercially available Multiple Tissue
Northern blots (CLONTECH, Palo Alto, CA) were hybridized and washed
under conditions recommended by the manufacturer.

Figures 6A-6U show Northern blots illustrating the expression of the
indicated genes in various tissues. With the exception of Gene 214 (Figure
6A), all blots were Multiple Tissue Northern Blots (CLONTECH, Palo Alto,
CA). The tissues included: 1) brain; 2) heart; 3) skeletal muscle; 4) colon; 5)
thymus; 6) spleen; 7) kidney; 8) liver; 9) small intestine; 10) placenta; 11)
lung; and 12) peripheral blood leukocytes. Size standards were
indicated to the left of each blot. Figure 6A shows the Northern blot for
Gene 214, which includes poly(A)+ selected RNA from 1) a lymphoblast cell
line from an asthmatic individual; 2) lung; and 3) trachea.

RT-PCR was used as an alternate method to Northern blotting to
detect mRNAs with low levels of expression. Total RNA from
human tissues was purchased from CLONTECH (Palo Alto, CA), and
genomic DNA was removed by DNaseI digestion. The Superscript
Preamplification System for First Strand cDNA Synthesis (Life Technologies,
Gaithersburg, MD) was used according to the manufacturer's directions with
oligo(dT) or random hexamers to synthesize cDNA from the DNaseI treated
total RNA. Gene specific primers were used to amplify the target cDNAs
in a 30 µl PCR reaction containing 0.5 µl of first strand cDNA, 1 µl sense
primer (10 µM), 1 µl antisense primer (10 µM), 3 µl dNTPs (2 mM), 1.2 µl
MgCl2 (25 mM), 3 µl 10X PCR buffer, and 1 U Taq Polymerase (Perkin

r; ( - 1~ «- ■ — - - ^ m i 

followed by 30 cycles at 94°C for 30 sec, 58°C for 1 min, and 72°C for 1 min,
and a final step at 72°C for 7 min. PCR products were analyzed on

a9arOS ;ri2 q 23^er genes are shown In Tabfo 4; foe nu d eo«de 
sequences correspond to SEQ » NO* to SEQ ID NOS2. foe enoo^ed 

«„h to <;fo ID N093-155, and the BAO 

am ino acid sequences 

nucleotide sequences correspond to SEQ ID NO.by* xo 
disclosed herein. 

EXAMPLE 9: MUTATION ANALYSIS

In order to conduct mutation analysis, the genomic structure of Gene
214, Gene 224, Gene 422, Gene 436, Gene 449, Gene 454, Gene 515,
Gene 561, Gene 570, Gene 581, Gene 698, Gene 702, Gene 722, Gene
748, Gene 751, Gene 757 and Gene 848 was determined. For genes with

— — , — - z= 

compared to genomic sequence from the BACs. me p 
:;is were determined based on foe c— 
potions. Tbe exon prediction programs MZEF W 1«7. 
Acad Sc/„ 94:565-568) and GenScan (Burge and Karf.n. 1997. J. Mo 

analysis methods descnoea ana | vs is was used to 

determine nucleotide sequence vanants 88CP «*- 

• _.• -a ~\ nwA *eauences for vanants. Bnefly, K^rc 
scre en — D " ated asthmatic individuals that showed 
ge nerate temp * from unrelate ^ ^ ^ 

increased shanng for the ^ 
towards linkage. Non-asthmatic ind.v.duals were 
Enzymatic amplification of genes within the asthma region of 12q23-qter
was accomplished using primers flanking each exon and the putative 5'
regulatory elements of each gene. The primers were designed to amplify
each exon, as well as 15 or more base pairs of each intron on either side of
the splice site. The forward and the reverse primers had two different dye
colors to allow analysis of each strand, and independent confirmation of
variants. PCR reactions were optimized for each exon primer pair. Buffer
and cycling conditions were specific to each primer set. PCR products were
denatured using a formamide dye, and electrophoresed on non-denaturing
acrylamide gels with varying concentrations of glycerol (at least two different
glycerol concentrations).

Primers utilized in fluorescent SSCP experiments to screen coding
and non-coding regions of Gene 214, Gene 224, Gene 422, Gene 436,
Gene 449, Gene 454, Gene 515, Gene 561, Gene 570, Gene 581, Gene
698, Gene 702, Gene 722, Gene 748, Gene 751, Gene 757 and Gene 848
for polymorphisms are provided in Table 8. Column 1 lists the genes
targeted for mutation analysis. Column 2 lists the specific exons analyzed.
Column 3 lists the assigned primer names. Columns 4 and 5 list the forward
primer sequences and the reverse primer sequences, respectively. The
genes listed in column 1 of Table 8 correspond to the gene identifiers in
column 1 of Table 4.
changes in the genes from 12q23-qter

the initial set of asthma,. « (perWn . Bmer 
fluorescent seguencng on an AB- 37 

App „ed -^^"^ ^rsham-Pharmarta 
Energy Transfer Dye Pnm manufa ct U rer. Primers 

following the standard protocol described by the manufacturer. Primers
used for dye primer sequencing are shown in Table 9. Column 1 lists the
9 enes Urgeted for seguencir, . 0> = ^ 2-. ^ ^ ^ 

iris^s^- 5 - 6 - - — primer 

names and reverse primer sequences, respectively. 
Single nucleotide polymorphisms (SNPs) that were identified in genes
from the disorder region are shown in Table 10. Column 1 lists the gene
names. Column 2 lists the exons that either contain the SNPs or are flanked
by intronic sequences that contain the SNPs. Column 3 lists the PMP sites
for the SNPs. Column 4 lists the localization of the SNPs to exon, intron, or
UTR sequences. Column 5 lists the SNP reference sequences and
illustrates the SNP nucleotide changes with underlining. Column 6 lists the
SEQ ID NOs of the SNP reference sequences. Column 7 lists the base
changes of the SNP sequences. Column 8 lists the amino acid changes
resulting from the SNP sequences.

The "-" symbols denote polymorphisms which are 5' of the exon and
are within the intronic region. The "-" polymorphisms are numbered going
from the 3' to 5' direction. The "+" symbols denote polymorphisms which are
3' of the exon and are within the intronic region. The "+" polymorphisms are
numbered going from the 5' to 3' direction. The first, second, and third
columns, combined, correspond to the SNP names as described herein,
e.g., 214_B_1, 214_E_+2, etc. It should be noted that the disclosed SNPs
are referred to herein using both short (e.g., 757_A_+4) and long (e.g.,
Gene 757 A +4) nomenclature.
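
The SNP naming convention described above (gene, exon, and site joined into one name, in a short or a long form) can be handled mechanically. The sketch below is illustrative only: the class, the function names, and the regular expressions are assumptions built from the example names quoted in the text, not part of the disclosure.

```python
import re
from typing import NamedTuple, Optional


class SnpName(NamedTuple):
    gene: str   # e.g. "757"
    exon: str   # e.g. "A"
    site: str   # e.g. "1", "+4", "-2"


# Patterns inferred from the examples 214_B_1, 757_A_+4 and "Gene 757 A +4".
_SHORT = re.compile(r"^(\d+)_([A-Za-z0-9]+)_([+-]?\d+)$")
_LONG = re.compile(r"^Gene\s+(\d+)\s+([A-Za-z0-9]+)\s+([+-]?\d+)$")


def parse_snp_name(name: str) -> SnpName:
    """Parse either the short or the long form of a SNP name."""
    text = name.strip()
    match = _SHORT.match(text) or _LONG.match(text)
    if match is None:
        raise ValueError(f"unrecognized SNP name: {name!r}")
    return SnpName(*match.groups())


def to_short(snp: SnpName) -> str:
    return f"{snp.gene}_{snp.exon}_{snp.site}"


def to_long(snp: SnpName) -> str:
    return f"Gene {snp.gene} {snp.exon} {snp.site}"


def intronic_flank(snp: SnpName) -> Optional[str]:
    """Report which side of the exon a signed site lies on, or None if unsigned."""
    if snp.site.startswith("-"):
        return "5' intronic"
    if snp.site.startswith("+"):
        return "3' intronic"
    return None


snp = parse_snp_name("757_A_+4")
long_form = to_long(snp)
```

An unsigned site returns None because the text assigns its localization (exon, intron, or UTR) through column 4 of Table 10 rather than through the name.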

The genomic sequences corresponding to the genes in Table 10 are
shown in Tables 3A and 3B. Taking the information from Tables 3A and 3B,
in combination with the last column in Table 4, one of skill in the art could
identify the entire genomic sequence of the genes and SNPs described
below. For example, the genomic sequence for Gene 214 is contained
within BAC clones RP11-702C13 and AC079031 (see Table 4), and the
nucleotide sequence of BAC clone RP11-702C13 corresponds to SEQ ID
NO:766 to SEQ ID NO:808.

EXAMPLE 10: ALLELE SPECIFIC ASSAY

Once variants were confirmed by sequencing, rapid allele specific
assays were designed to type more than 400 individuals (> 200 cases and >
200 controls) for use in the association studies. All coding SNPs (cSNPs)
that resulted in an amino acid change were typed. Neutral polymorphisms
were typed if: 1) the polymorphism was present in an exon lacking a cSNP;
2) the polymorphism was present in an exon containing a cSNP, but the two
polymorphisms were observed to have different frequencies; or 3) the
polymorphism was in an intronic region adjacent to an exon without a cSNP.
If results from the association studies appeared positive, additional neutral
polymorphisms were typed.

Three types of allele specific assays (ASAs) were used. If the SNP
resulted in a mutation that created or abolished a restriction site, RFLPs
were obtained from PCR products that spanned the variants, and were
subsequently analyzed. If the polymorphism did not result in an RFLP,
allele-specific oligonucleotide or exonuclease proofreading assays were
used. For the allele-specific oligonucleotide assays, PCR products that
spanned the polymorphism were electrophoresed on agarose gels and
transferred to nylon membranes by Southern blotting. Oligomers 16-20 bp
in length were designed such that the middle base was specific for each
variant. The oligomers were labeled and successively hybridized to the
membrane in order to determine genotypes.
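
The oligomer design rule stated above (16-20 bp, with the middle base specific for each variant) can be illustrated with a short sketch. The function name, the default length, and the example sequence are hypothetical; for even lengths the choice of which central position counts as the "middle" is an assumption of this sketch.

```python
def design_aso_pair(ref_seq, snp_index, alleles, length=17):
    """Return one oligomer per allele, centered on the polymorphic base.

    ref_seq   -- sequence surrounding the SNP (5' to 3')
    snp_index -- 0-based position of the polymorphic base in ref_seq
    alleles   -- the variant bases, e.g. ("A", "G")
    length    -- oligomer length; the text specifies 16-20 bp
    """
    if not 16 <= length <= 20:
        raise ValueError("oligomer length must be 16-20 bp")
    left = length // 2            # bases 5' of the SNP
    right = length - left - 1     # bases 3' of the SNP
    start = snp_index - left
    end = snp_index + right + 1
    if start < 0 or end > len(ref_seq):
        raise ValueError("not enough flanking sequence around the SNP")
    return {
        allele: ref_seq[start:snp_index] + allele + ref_seq[snp_index + 1:end]
        for allele in alleles
    }


# Hypothetical 25-base region with an A/G polymorphism at index 12.
region = "ACGTACGTACGTACGTACGTACGTA"
oligos = design_aso_pair(region, 12, ("A", "G"))
```

The two oligomers differ only at the central position, which is what lets successive hybridizations to the same membrane distinguish the alleles.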

Table 11A, below, shows the information for the ASAs. Column 1
lists the SNP names. Column 2 lists the specific assays used (RFLP or
ASO). Column 3 lists the enzymes used in the RFLP assay (described
below). Columns 4 and 6 list the sequences of the primers used in the ASO
assay (described below). Columns 5 and 7 list the corresponding SEQ ID
NOs for the primers. It should be noted that the disclosed SNPs are
referred to herein using both short (e.g., 454_E_2; see Table 11A) and long
(e.g., Gene 454 E 2; see Examples 11-13) nomenclature.

1. RFLP Assay: The amplicon containing the polymorphism was
PCR amplified using primers that generated fragments for sequencing
(sequencing primers) or SSCP (SSCP primers). The appropriate population
of individuals was PCR amplified in 96-well microtiter plates. Enzymes were
purchased from NEB. The restriction cocktail containing the appropriate
enzyme for the particular polymorphism was added to the PCR product.
The reaction was incubated at the appropriate temperature according to the
manufacturer's recommendations for 2-3 hr, followed by a 4°C incubation.
After digestion, the reactions were size fractionated using the appropriate
agarose gel depending on the assay specifications (2.5%, 3%, or Metaphor,
FMC Bioproducts). Gels were electrophoresed in 1 X TBE buffer at 170 V
for approximately 2 hr. The gel was illuminated using UV, and the image
was saved as a Kodak 1D file. Using the Kodak 1D image analysis
software, the images were scored and the data was exported to Microsoft®
Excel (http://www.microsoft.com).
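
The RFLP assay applies only when a SNP creates or abolishes a restriction site. A minimal way to check this for a candidate enzyme is sketched below. The function names and sequences are hypothetical, and the EcoRI recognition site (GAATTC) is used purely as an example; the enzymes actually used are listed in Table 11A. Only the given strand is scanned, which suffices for a palindromic site; a non-palindromic site would also need its reverse complement checked.

```python
def count_sites(seq, site):
    """Count occurrences (including overlapping ones) of a recognition site."""
    seq, site = seq.upper(), site.upper()
    return sum(
        1 for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site
    )


def rflp_effect(ref_seq, snp_index, alt_base, site):
    """Classify the variant as 'creates', 'abolishes', or 'none' for this site."""
    alt_seq = ref_seq[:snp_index] + alt_base + ref_seq[snp_index + 1:]
    ref_count = count_sites(ref_seq, site)
    alt_count = count_sites(alt_seq, site)
    if alt_count > ref_count:
        return "creates"
    if alt_count < ref_count:
        return "abolishes"
    return "none"


# Hypothetical amplicon fragment carrying one EcoRI site (GAATTC).
effect = rflp_effect("TTGAATTCAA", 4, "G", "GAATTC")
```

A result of "none" for every available enzyme would correspond to the case in the text where the ASO or exonuclease proofreading assay is used instead.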

2. ASO assay: The amplicon containing the polymorphism was
PCR amplified using primers that generated fragments for sequencing
(sequencing primers) or SSCP (SSCP primers). The appropriate population
of individuals was PCR amplified in 96-well microtiter plates and re-arrayed
into 384-well microtiter plates using a Tecan Genesis RSP200. The
amplified products were loaded onto 2% agarose gels and size fractionated
at 150 V for 5 min. The DNA was transferred from the gel to Hybond N+
; membrane (Amersham-Pharmacia, using a Vacuum 
I ftfter containing the blotted PGR ^^^^ 
25 containing 300 ml pre-hybridization solutron (5 X SSPE (ph 7.4) 2 / 

X DenhaLs). The filter was incubated in pre-hybridization so.ufion at 40 C 
for over 1 hr After pre-hybridization. 10 m, of the pre-hybrid*at,on soluhon 
J ; h fter were —ed to a washed glass bo,,.. The alle e-spe<^ 
oligonucleofides (ASO) were designed ^ * - = - ; £ 
30 middle of the nucleotide sequence. The size of tne g 

indent upon the GC content o, the sequence around the polymorphs. 



Those ASOs that had a G or C polymorphism were designed so that the Tm
was between 54-56°C. Those ASOs that had an A or T polymorphism were
designed so that the Tm was between 60-64°C. All oligonucleotides were
at *e 5 ends and purchased from GibcoBRL. For each 

, polymorphism, 2 ASOs were designed to yieid one ASO for each 

The ASOs that represented each polymorphism were resuspended at
a concentration of 1 µg/µl. Each ASO was end-labeled with γ-32P-ATP (6000
Ci/mmol) (NEN) using T4 polynucleotide kinase according to the manufacturer's
recommendations (NEB). The end-labeled products were removed from the
unincorporated γ-32P-ATP using a Sephadex G-25 column according to the
manufacturer's instructions (Amersham-Pharmacia). The entire end-labeled
product of one ASO was added to the bottle containing the filter
and 10 ml hybridization solution. The hybridization reaction was placed in a
rotisserie oven (Hybaid) and left at 40°C for a minimum of 4 hr. The other
ASO was stored at -20°C.

After the prerequisite hybridization time had elapsed, the filter was
removed from the bottle and transferred to 1 L of wash solution (0.1 X SSPE
(pH 7.4) and 0.1% SDS) pre-warmed to 45°C. After 15 min, the filter was
transferred to another liter of wash solution (0.1 X SSPE (pH 7.4) and 0.1%
SDS) pre-warmed to 50°C. After 15 min, the filter was wrapped in Saran
Wrap, placed in an autoradiograph cassette, and an X-ray film (Kodak)
was placed on top of the filter. Typically, an image was visible within 1 hr.

was piaceu um imaqes were 

Aft er an image was captured on ^ ima ge 

captured following wash steps at 55 C. 60 C ana 

" TUsO was removed from the filter by adding 1 L of boiling strip 

, , Jm 1 xtsPE (PH 7.4) and 0.1% SDS). This was repeated two more 
solution (0.1 x bs>re u> n ' ^a\-, 0 h in 300 ml pre- 

*k« Acn the filter was pre-hybndized in auu mi h'^" 
« me s. After removing he ASa , e f, e P ^ b x ^ 

hybridization soluhon (5 XWFEJ^ AS0 to the other 

30 40°C for over 1 hr. The second ^ ^ ^ 

strand was removed from storage at -20 C ana ma 
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*k m ml hybridization solution and the 
piaced into a g.ass bott.e a.ong wrth 10 m, hybn ^ 
entire end-.abe.ed product of me second ASOThe hyb 

- — * - trr^JST!Sr- — at 
40 °C for a minimum of 4 hr. After y described ab0 ve. The 

Converter, and the overlaid images were scored. 
* Exonuclease Proofreading Assay: 

^ 1 A Cca « fEPAs} were also employed (see 

U.S. Patent No. 5,391 flu0 rescent tags at the 
po.ymorph.sms of interest were des,gned to cont*n ^ 
* ends. The primers were designed sue hat he 3 en ^ 
variant or consensus nuc.eot.des. ^^^^ ^erase; 

— - - T;r^:c:-; ^ where 

Roche, Germany, Cat. No. 1 64 > d ^ ^ bases . 

we re matched, the resutong «Rp£ electroptl0 re,s or descent 
The tagged bases were detected by g Ge ne 436, 

po.anzat.on. Examp.es of primers used for EPA y 
Gene 454, Gene 570, and Gene 698 are shown ,n the Tabie 
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TABLE 11B: EPA PRIMERS 
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PV AMPLE 1 V AQ^nClATlO V CTIir>Y ANALYSIS 

EXAMP^I^^ tQ determ ., ne * 

. iHate aenes W ere associated with the asthma 

the family based transmission disequ.hbm.rn test (TDT) (N.fe. mo 
"""998. Proc. Na». .cad. SC, USA 95:1,383-93). Case^onfrol 

total of three nunoreu individual was 1) 

n<5 inclusion into the study required that the control 
US . inclusion se | f . re port of never having asthma), 

negative for asthma (as determined by self repo 

2) had no firs, degree relatives with asthma; and 3 ^ 

- — T * rLra—e It affectedeibpa, 
deviated «^^£^^ te 4 — 

fami " rr^Tbe -u" 0, the sKin p« tests were used to select a 
were also collected^ Th ^ ^ negatlve 

, subset of controls that were most likely •* ^ 

A subs et of " - 

„ based on the J--- * =ratjng |dentfty . by . escen t ( ,BO) at 
a given gene. One effected ^ ^ ^ ^ ^ 

^ " es eacb gene in the resio, a larger collection 

!5 appropnate ^ses may y interval was genotyped. A 

of individuals who were IBD across a g ^ ^ 

subset of this collection was used in the analyses 

aff ected individuals and 2 00 mete L of » 

frequencies. This number provided 80 /. powe 
30 or greater between the two groups for a rare allele 0 5/.) 
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significance. For a common allele (50%). the number provided 80 /o power 
to detect a difference of 10% or more between the two groups. 

For each polymorphism, the frequency of the alleles in the control
and case populations was compared using a Fisher's exact test. A mutation
that increased susceptibility to the disease was expected to be more
prevalent in the cases than in the controls, while a protective mutation was
expected to be more prevalent in the control group. Similarly, the genotype
frequencies of the SNPs were compared between cases and controls.
P-values for the allele test were plotted against a coordinate system based on
genomic sequence to visualize regions where allelic association was
present. A small p-value (or a large value of -log(p), as plotted in the figures
described below) was deemed indicative of an association between the
SNPs and the disease phenotype. The analysis was repeated for the US
and UK populations, separately, to correct for genetic heterogeneity.
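The allele comparison described above can be sketched as follows. This is a minimal illustration only: the function name and the allele counts are hypothetical and are not taken from the study, and the two-sided p-value is computed directly from hypergeometric probabilities rather than with whatever software was actually used.

```python
from math import comb, log10

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical allele counts (assumed for illustration):
# each tuple is (copies of allele 1, copies of allele 2).
controls = (60, 340)
cases = (100, 300)
p_value = fisher_exact_two_sided(*controls, *cases)
# -log10(p) is the quantity that would be plotted against genomic position.
print(p_value, -log10(p_value))
```

The same function can be applied to the US and UK samples separately by passing the allele counts of each population in turn.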

2. Association test with individual SNPs: Chromosomal regions
harboring asthma susceptibility genes were identified by association studies
using the SNP typing data. Four separate phenotypes were used in these
analyses: asthma, bronchial hyper-responsiveness, total IgE, and specific
IgE.

a. Asthma Phenotype: A coordinate system was
developed based on available genomic sequence, and was used to plot
significance values of SNPs and haplotypes according to their relative

: 9 :r an - <- — - ■» ii - 26) - °rr B B 

genomic sequences were assembled to provide a framewor* for as maj.0 
5 of relative physical distance between SNPs. Where necessary, gaps were 

The significance levels (p-values) for allelic association of all typed
SNPs to the asthma phenotype are plotted in Figure 11 (combined
population) and Figure 12 (US and UK populations, separately).
Frequencies and p-values for SNPs associated with the asthma phenotype
are shown in Tables 12A, 12B, and 12C for the combined population and for
the UK and US populations, separately.

an0 3 M the control ( CNTL )*» Q $ ^ 

respectively. Coiumns 4 and 5 « tea ^ ^ ^ ^ ^ 

fr e q uendesandsarnp 1 e S ,zes(N).respea,vely ^ 

p.value for the comparison between the case 

s , nifi rasso.at to n - the asthma — 
, Ration, when comparing the allele freguenc the cas 

rr=hi» 12A1 When analyzing the population separately, » 
9roUPS 4 rLn?757 Gene 698 anri Gene 561 showed a signrtcant 
Gene 454, Gene (oi, ^ NR Gene 

nation in the OK ^ ^^'phenotype in the 
«• anri Gene .1 showed a J^U- — ^ 
5 US population (Table 12C). Additional S NPsin 

comparing trie ^^^^tT.Z Gene 570 in 
Gene 436 in the combined populate, and m Gene 

Seven SNPs ,n Gene 4* * 7 wo 

20 different in the cases versus the controls n ^ % 

SNPS in exon O were more freguent ; Mh -^, y) These 

respectively) than in the cases 02 . and 33 ^ ^ ^ 

prances were statistically sign-ficant (p-OX) an P ) ^ ^ 
valu es obtained for me genotype T^^"^ (p=0.03). 

25 SNP also ,eached statistical s,gn,ficance n*e UK p ^ ^ 

^^""•r^ZZZZZ acid change of 
population. The first SNP m exon O results ^ 
gamine to arginine. In add-on, one SNP m exon 
exon M, reached sfcfisfica, -^"i*,^ and had 

30 S^rL^^"^-""- 



I so 

, , a nri the allele frequencies were 42% in controls 
combined population, and tne aiieie ue M 

W in cases The p-value was 0.02 for the US sample, and the 
versus 32% in cases. in»r The genotype 

allele frequencies were 41 % in controls versus 23 /. of cases. g »~ 
caparison was significant (p=0.02) in the combined populate. The 
ZZ sH* showed signmcance for ho. the allele and genotype tests , 
2 comhined population (p=0.007). for the a.lele companson ,n the US 
<n-o 021 and for the genotype comparison in the UK (p=0.03). 
lP Th « other SUPS reach statical signmcance in the combmed 
p-JT-i the US or UK populations, alone. 1) SNP E 2. wh,ch 

~ cr: p — - 

aal and genome tests in the UK population. aUele frequences of 51 A« 
Inland 63 i in cases); 2) SNP H 1. an arginine to hisfidine amm acd 
5 lange (p=0.003 for me a.lele and p=0.002 for the genotype tea* m the 
5 cnanyc, vk . -Qptro s and 33% in 

combined population, allele frequences of 22/. m the con 
the cases- p=0.04 in the UK population, allele frequences of 23/. ,n cont 

^2-/ in cases- p-0.03 for the allele and p=0.02 for the genotype test » 
and 32 ^ in cases, p u 36% jn eases); 

-US Populate,, |— ^ e and p=0 . 003 for the 

genotype tests m „„j „-n n3 for the genotype 

, in cases- 0=0.04 for the allele and p-O.OJ tor me a 

controls and 24 /. in cases, p u 21% ^ 

tests in US population, allele frequences of 37/. 

COntr °'^ W in Gene 757 reached statistical significance for the aUele 
2£ One SNP in Gene conttied 

te st in the combined and UK popu £ s^SNP A P ^ ^ ^ 
population. aliele frequences of 18/. . con ^ ^ 

in the UK sample, allele f^ence . £7*. co^ ^ ^ 

Another SNP in the same Exon (A 4) reacn 
30 genotype test in the combined population (p«0.05). 
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Multiple SNPs in Gene 561 reached statistical significance in either 
the combined population or in the US or UK populations, separately. SNP J
1 was significant in the combined population (p=0.04, allele frequencies of
15% in controls and 9% in cases); SNP Y +1 was significant in the UK
sample (p=0.002, allele frequencies of 5% in controls and not present in
cases); and SNP H 1 was significant in the US population (p=0.02, allele
frequencies of 10% in controls and 25% in cases). SNP H 1 also showed a
significant genotype p-value in the combined population (p=0.03) and in the
US population (p=0.01), while SNP Y +1 showed a significant genotype
p-value (p=0.001) in the UK population. None of these SNPs resulted in
amino acid changes.

A single SNP in Gene 214 reached statistical significance in the
combined population (p=0.04, allele frequencies of 28% in controls and 36%
in cases).

For Gene 436, one SNP (E 1) showed a significant genotype p-value
in the combined population (p=0.04).

One SNP in Gene 698 (E 1) reached statistical significance in the UK
population (p=0.01 for the allele test, p=0.02 for the genotype test, allele
frequencies of 5% in controls and 12% in cases). This SNP results in an
arginine to lysine amino acid change.

SNPs in two genes, Gene 515 and Gene 570, showed significant
genotype p-values in the US population alone (515 A 1, p=0.007; 515 A 2,
p=0.005; 515 A 4, p=0.001; 570 F 1, p=0.007).
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TABLE 12A- ASSOCIATE ANALYSIS OF ASTHMA PHENOTYPE 
TABLE 1ZA. as COMB|NED US , UK POPULATION 





TABLE 120= ASSOCAT.ON ^K^™*™" 0 ™ 
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mg/ml) or PC a (16). as described in the Linkage Analysis section. (Example 
3). First, sibling pairs were identified where both sibs were affected and
satisfied this new criteria. Of these pairs, one sib was included in the
case/control analyses if they showed evidence of linkage at the gene of
interest. This phenotype was more restrictive than the Asthma yes/no
criteria; hence the number of cases included in the analyses was reduced
approximately in half. Where the PC20(16) subgroup represented a more
genetically homogeneous sample, one could expect an increase in the effect
size compared to the one observed in the original set of cases. However,
the reduction in sample size could result in estimates that were less
accurate. This, in turn, could obscure a trend in allele frequencies in the
control group, the original set of cases, and the PC20(16) subgroup. In
addition, the reduction in sample size could induce a reduction in power
(and increase in p-values) in spite of the larger effect size.
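The trade-off between a smaller case group and a larger effect size can be illustrated with a rough power calculation. The sketch below uses a normal approximation for a two-sided comparison of two allele frequencies; it is an assumption made for illustration, not the power method used in the study, and the frequencies and allele counts are hypothetical.

```python
from statistics import NormalDist

def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    p1, p2 are the allele frequencies in the two groups; n1, n2 are the
    numbers of alleles (two per individual) sampled in each group.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_null = (p_bar * (1 - p_bar) * (1 / n1 + 1 / n2)) ** 0.5
    se_alt = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    delta = abs(p1 - p2)
    return (z.cdf((delta - z_crit * se_null) / se_alt)
            + z.cdf((-delta - z_crit * se_null) / se_alt))

# Hypothetical scenario: 400 control alleles throughout.
full_set = power_two_proportions(0.40, 0.30, 400, 400)       # original cases
half_set = power_two_proportions(0.40, 0.30, 400, 200)       # half the cases
half_bigger = power_two_proportions(0.40, 0.25, 400, 200)    # half, larger effect
print(full_set, half_set, half_bigger)
```

Halving the case group lowers power when the effect size is unchanged; whether a larger effect in the more homogeneous subgroup makes up for it depends on how much larger that effect is.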
The significance levels (p-values) for allelic association of all typed
SNPs to the BHR phenotype are plotted in Figure 13 (combined population)
and Figure 14 (US and UK populations, separately). Frequencies and
p-values for SNPs associated with the BHR phenotype are shown in Tables
13A, 13B, and 13C for the combined population and for the UK and US
populations separately.

TABLE 13A: ASSOCIATION ANALYSIS OF BHR PHENOTYPE
COMBINED US AND UK POPULATION



Combined US and UK 



1.0312 




0.1353 
0.0181 

0.2923 
0.2621 
0.8944 
0.4893 
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214 E +1 
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TABLE 13B 



• ASSOCIATION ANALYSIS OF BHR PHENOTYPE 
UK POPULATION 



UK population 
GENEEXON 




0.4925 
1.000C 
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561 J 1 




TABLE 13C: ASSOCIATION ANALYSIS OF BHR PHENOTYPE 
US POPULATION





obs ewe d for *. phenc ype - ^ 

Gen e4M.G e ne 757.ana Gene 5^ ^ a „ ele 

frequencies in the com. 






populations separately. SNPs in Gene 454 and Gene 561 showed a
significant association in the UK population alone, while SNPs in Gene 561
showed a similar association with the phenotype in the US population. In
addition, the genotypic comparison yielded significant results for SNPs in
Gene 570 in the US population and in Gene 214, Gene 454 and Gene 561
for the UK and combined population (see Tables 13A-13C).

The most significant results were obtained for Gene 454, where SNP
E 2 showed a p-value of 0.007 for the allele test and p-value of 0.01 for the
genotype test in the combined population (49% in control vs. 64% in cases).
SNP E 2 was also significant in the UK population alone for the allele
(p=0.02) and genotype (p=0.04) tests. Two more SNPs reached statistical
significance in Gene 454 for this sub-phenotype: 1) SNP H 1 (p=0.02 in the
combined population, 22% in controls vs. 33% in cases; p<0.05 for
genotypic test in the UK population); and 2) SNP F -2 (genotype p-value of
0.03 in the combined population).

For Gene 757, SNP A 2 was significant with a p-value of 0.03 in the
combined population (18% in controls vs. 29% in cases).

One SNP in Gene 561 was significant in both the combined
population and in the UK population alone (p=0.03 for both the allele and
genotype tests in the combined population, 5% in controls vs. not present in
cases; p<0.05 for allele and p=0.04 for genotype in UK, 6% in controls vs.
not present in cases).

Gene 214 was significant in both the combined and UK populations
when comparing the genotype frequencies between the cases and controls
(p=0.01 combined population, p=0.02 UK population).

In the US population, one SNP in Gene 561 reached statistical
significance (F +2, p=0.04, 24% in controls vs. not present in cases). The
comparison of genotype frequencies also yielded a significant result for
Gene 570 (SNP F 1, p<0.05).

c. Total IgE: The analyses were performed using
asthmatic children with elevated total IgE levels, as described in the Linkage
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Ana , y sis section (Example 3, First, sihiin 9 

wa s induded ,n the J was mQre restrictive than the 

Z "r S ~ cases inched in the ana, y ses 

in Fiaure 15 (combined population) 

14A, 14B, and 14C for the combined population and for the UK and US
populations, separately.

TABLE U* *^^a-p^^"^ 





"TABLE US: ASSOCT.ON «OJ TOTAL * PHENOTYPE 




0.3443 
0.1334 




TABLE 14C: ASSOCIATION ANALYSIS OF TOTAL IgE PHENOTYPE




214 
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For the total IgE phenotype. SNPs in Gene 454, Gene 436 and Gene 
214 showed a significant association in the combined population when
comparing the allele frequencies in the case and control groups. When
analyzing the populations separately, SNPs in Gene 454 were significant in
both the UK and US populations, while SNPs in Gene 698 and
Gene 561 showed a significant association in the UK population. Additional
significant results were identified when comparing the genotype frequencies
in the case and control groups. SNPs in Gene 454 (US, UK, and
combined), Gene 515 (US), Gene 570 (US), Gene 757 (UK and combined),
Gene 698 (UK), and Gene 561 (US and UK) reached statistical
significance.

The most significant results were obtained for Gene 454, where 6
SNPs showed significant association with the phenotype at the allelic level
in the combined population and in one of the subpopulations. SNP H 1
showed highly significant results in the combined and US populations
(p=0.003 for the allele test and p=0.001 for the genotype test in the
combined population, 22% in control vs. 35% in cases; p=0.0001 in US for
. ■ . uv in rasesl Two other SNPs had p- 
hnth tests 20% in controls vs. 64% in cases), i wu u 
botn tests, t» (p=0.009 for the 

values <0.01 in the combined populaton. 1) SNP F (P u 
allele test and P =0.01 for the genotype test in the combined populate. 35 . 

cases)- and 2) SNP O 1 (P=0.004 for the allele test and p=0.02 for the 
genotype test in the combined population, 19% in controls vs. 9% m cases; 
D -0 01 and p=0.04 for the allele and genotype tests respect,vely in UK, 19 /. 
^controls vs. 8% in cases). Another SNP in exon O (O 6) had a p-value « 
the significant range (p=0.02 and p<0.05 for the allele and genotyp^ 
respectively, in the combined population. 42% of controls vs. 31 A .« cases 
D=0 03 for the allele test and p-0.01 for the genotype test m US. 39 A of 
co t ols vs. 14% of cases; P<0.05 in UK for genotype test, In add = ^ 
SNPs in high linkage disequilibrium with each other reached statistical 
igni cance in exon M: 1) M 1 (P=0.01 and p=0.04 for me allele and 
genotype tests, respectively, in the combined population, 42% ,n controls ^ 
, 2 9vTcases- p=0.003 for the allele test and p=0.02 for the genotype test ,n 
i 29 /o m cases, p u. , D=0 0 1 for the allele test 

US 41 % in controls vs. 6% in cases); and 2) M 1 (P o.m to 
an d p=0.03 for the genotype test in the combined populate. 43, n 
controls vs. 31% in cases; p=0.02 and p-0.01 for the allele and genotype 
tests respectively, in US, 41% o, controls vs. 14% of cases). 
5 Gene 436 and Gene 214 both showed a single SNP 

statistical significance in the combined population on,, In G«, ^ 
♦ r php 454 SNP K -2 was significant (p=0.03, 1 5 /o in conirois 
is adjacent to Gene 454 SNP g simiiar ^ of 

vs 7% of cases), while in Gene 214, bNK t 
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no occurrence in cases) while SNP H 1 showed a significant genotype test 

in the US population (p=0.04). 

A single SNP in Gene 698 showed a significant association with the
total IgE subphenotype in the UK population (p=0.01 for the allele test and
p=0.02 for the genotype test, 5% of controls vs. 13% of cases).

For Gene 757, SNP A +4 showed a significant genotype test in both
the combined and the UK samples (p<0.05 combined, p=0.03 UK).

SNPs in two genes, Gene 515 and Gene 570, had significant
genotype p-values in the US population alone (515 A 1, p=0.02; 515 A 2,
p=0.02; 515 A 4, p=0.03; 570 F 1, p=0.04).

d. Specific IgE: The analyses were performed using
asthmatic children with elevated specific IgE levels for at least one allergen,
as described in the Linkage Analysis section (Example 3). First, sibling
pairs were identified where both sibs were affected and satisfied this new
criteria. Of these pairs, one sib was included in the case/control analyses if
they showed evidence of linkage at the gene of interest. This phenotype
was more restrictive than the Asthma yes/no criteria; hence the number of
cases included in the analyses was reduced by approximately 38%.

The significance levels (p-values) for allelic association of the typed
SNPs to the specific IgE phenotype are plotted in Figure 17 (combined
population) and Figure 18 (US and UK populations, separately).
Frequencies and p-values for SNPs associated with the specific IgE
phenotype are shown in Tables 15A, 15B, and 15C for the combined
population and for the UK and US populations, separately.

TABLE 15A: ASSOCIATION ANALYSIS OF SPECIFIC IgE PHENOTYPE



GENOTYPE 
P-VALUE 








TABLE 15B: ASSOCIATION ANALYSIS OF SPECIFIC IgE PHENOTYPE 

UK POPULATION 
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TABLE 15C: ASSOC.ATK.N ANALYSIS OF^EC,P.C , 9 E PHENOTYPE 
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For the specific IgE subphenotype. SNPs in Gene 454 and Gene 757 
showed a significant association in the combined population when
comparing the allele frequencies in the case and control groups. When
analyzing the populations separately, SNPs in Gene 561 showed a
significant association in both the US and UK populations. In addition, five
SNPs in Gene 454 showed association with the subphenotype in the US
population. Gene 698 contained a SNP reaching statistical significance in
the UK population only. Additional significant results were identified when
comparing the genotype frequencies in the case and control groups. SNPs
in Gene 515, Gene 561, and Gene 454 reached statistical significance in the
US population. SNPs in Gene 454, Gene 561, and Gene 698 were
significant in the UK and in the combined population. In addition, a SNP in
Gene 757 was significant at the 0.05 level in the combined population.
The most significant results were found in Gene 454, where 6 SNPs
yielded significant association with the subphenotype in the combined
population. SNP H 1 showed highly significant results in the combined and
US populations (p=0.0006 and 0.0003 for the allele and genotype tests
respectively in the combined population, 22% in control vs. 38% in cases;
p=0.0002 for the allele test and p=0.0006 for the genotype test in the US
20 population. 20% in controls vs. 53% in cases; genotypic test p*O06 :m to. 
UK population). Two SNPs in exon M gave significant results: 1) M 1
(p=0.01 and p=0.03 for the allele and genotype tests in the combined
population, 42% in controls vs. 30% in cases; p=0.003 for the allele and
p=0.02 for the genotype test in US, 41% in controls vs. 13% in cases); and
2) M +1 (p=0.01 and p=0.02 for the allele and genotype tests respectively,
in ,he combined population. 43% in controls vs. 31% in cases p-0.00 * 
the allele and p=0.02 for the genotype test in US, 41% of controls vs. 1 .of 
cases). Three other SNPs had p-values <0.05 in the combined population:
1) SNP E 2 (p=0.04 and p=0.02 for the allele and genotype tests
respectively, in the combined population, 49% in controls vs. 59% in cases);
2) SNP F -2 (p=0.03 and p=0.008 for the allele and genotype tests
respectively, in the combined population, 35% in controls vs. 24% in cases;
p=0.01 for the allele test in US, 37% in controls vs. 15% in cases); and 3)
SNP O 6 (p=0.02 and p<0.05 for the allele and genotype tests respectively,
in the combined population, 42% in controls vs. 31% in cases; p=0.02 for the
allele test and p=0.04 for the genotype test in US, 39% in controls vs. 18%
in cases).

For Gene 561, SNP Y +1 reached statistical significance in the UK
population (p=0.02 and p=0.01 for the allele and genotype tests
respectively, 6% in controls vs. no occurrence in cases) while SNP H 1 had
a significant p-value in the US population (p=0.01 and p=0.02 for the allele
and genotype tests respectively, 10% in controls vs. 29% in cases).

A single SNP in Gene 698 showed a significant association with the 
specific IgE subphenotype in the UK population (p=0.02 and p=0.03 for the 
allele and genotype tests respectively, 5% of controls vs. 13% of cases). 

For Gene 757, SNPs A 2 and A 4 showed a significant association
with the subphenotype in the combined population (A 2, p<0.05, 18% in
controls vs. 27% in cases; A 4, p=0.04 for both the allele and genotype tests,
2% in controls vs. 5% in cases).

Additionally, three SNPs in Gene 515 had significant genotype
p-values in the US population alone (A 1, p=0.02; A 2, p=0.03; A 4, p=0.03).

In summary, evidence obtained from association studies implicated
several genes in the 12q23-qter region as being involved in respiratory
diseases. This was supported by analysis of the asthma (yes/no)
phenotype, BHR phenotype, total IgE phenotype, and specific IgE
phenotype in asthmatic individuals. Thus, chromosome 12q23-qter
encompassed genes involved in asthma and related diseases thereof.

EXAMPLE 12: HAPLOTYPE ANALYSES

In addition to the analysis of individual SNPs, haplotype frequencies
between the case and control groups were also compared. The haplotypes
were constructed using a maximum likelihood approach. Since existing
software for predicting haplotypes was unable to utilize individuals with
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missing data, a program was developed to analyze all individuals. This 
provided more accurate haplotype frequency estimates. Haplotype analysis
based on multiple SNPs in a gene was expected to provide increased
evidence for an association between a given phenotype and that gene if all
haplotyped SNPs were involved in the manifestation of the phenotype. In
other words, allelic variation involving the haplotyped SNPs was expected to
be associated with different risks of or susceptibilities to the phenotype.
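A maximum likelihood estimate of two-SNP haplotype frequencies that tolerates missing genotypes can be sketched with an expectation-maximization (EM) loop. The text does not describe the program that was developed, so the code below is only an assumed, generic EM formulation; the function names and the genotype coding (count of one allele per SNP, `None` for a missing call) are hypothetical.

```python
from itertools import product

# The four possible haplotypes over two biallelic SNPs (allele 0 or 1 at each).
HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def compatible_pairs(genotype):
    """Ordered haplotype pairs consistent with a two-SNP genotype.

    genotype is (g1, g2); each g is the count of allele 1 (0, 1 or 2),
    or None when the genotype at that SNP is missing.
    """
    pairs = []
    for h1, h2 in product(range(4), repeat=2):
        if all(g is None or HAPS[h1][s] + HAPS[h2][s] == g
               for s, g in enumerate(genotype)):
            pairs.append((h1, h2))
    return pairs

def em_haplotype_frequencies(genotypes, iterations=200):
    """Estimate haplotype frequencies by EM from unphased genotypes."""
    freqs = [0.25] * 4
    pair_lists = [compatible_pairs(g) for g in genotypes]
    for _ in range(iterations):
        counts = [0.0] * 4
        for pairs in pair_lists:
            # E-step: weight each compatible pair by its current probability.
            weights = [freqs[h1] * freqs[h2] for h1, h2 in pairs]
            total = sum(weights)
            if total == 0:
                continue
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: renormalize expected haplotype counts into frequencies.
        n = sum(counts)
        freqs = [c / n for c in counts]
    return dict(zip(HAPS, freqs))

# Hypothetical genotypes, including one individual with a missing call.
sample = [(0, 0)] * 5 + [(2, 2)] * 5 + [(1, 1)] * 5 + [(1, None)]
print(em_haplotype_frequencies(sample))
```

Individuals with a missing genotype are kept because every haplotype pair consistent with the observed SNP contributes in proportion to its current estimated probability.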

The estimated frequency of each haplotype was compared between
cases and controls by a permutation test. An overall comparison of the
distribution of all haplotypes between the two groups was also performed.
For each gene with two SNPs or more, all 2-at-a-time haplotypes were
constructed, and their frequencies were compared between the case and
control groups. P-values for the overall comparisons were plotted against a
coordinate system based on genomic sequence (average location of the two
SNPs in the haplotype). This was used to visualize regions where haplotype
association was present. A small p-value (or a large value of -log(p), as
plotted in the figures described below) was indicative of an association
between the haplotyped SNPs and the disease phenotype. The analysis
was repeated for the US and UK populations, separately, to adjust for the
possibility of genetic heterogeneity.
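The permutation comparison can be sketched as below. This is a simplified illustration under stated assumptions: each individual is represented by a known number of copies (0, 1 or 2) of one haplotype, whereas the analysis described above re-estimated haplotype frequencies by maximum likelihood. The function name, the data, and the number of permutations are hypothetical.

```python
import random

def permutation_p_value(case_values, control_values, n_perm=10000, seed=0):
    """Permutation p-value for a difference in mean between two groups."""
    rng = random.Random(seed)
    n_cases = len(case_values)
    n_controls = len(control_values)
    observed = abs(sum(case_values) / n_cases
                   - sum(control_values) / n_controls)
    pooled = list(case_values) + list(control_values)
    hits = 0
    for _ in range(n_perm):
        # Shuffle case/control labels while keeping group sizes fixed.
        rng.shuffle(pooled)
        stat = abs(sum(pooled[:n_cases]) / n_cases
                   - sum(pooled[n_cases:]) / n_controls)
        if stat >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)

# Hypothetical haplotype copy numbers per individual.
cases = [2] * 10 + [1] * 30 + [0] * 60
controls = [2] * 2 + [1] * 18 + [0] * 80
print(permutation_p_value(cases, controls, n_perm=2000))
```

Because the labels are shuffled rather than modeled, the test makes no distributional assumption about the haplotype frequency estimates.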

1. Asthma phenotype: Figure 19 (combined population) and
Figure 20 (US and UK populations separately) show the results for the
haplotype analysis (2-at-a-time) for all SNPs in Gene 214, Gene 436, Gene
454, Gene 515, Gene 561, Gene 570, Gene 698, Gene 702, Gene 722, and
Gene 757.

The most significant associated haplotype was formed by SNPs E -1
and E +1 from Gene 214, which had a p-value of 0.00001 in the combined
population (p=0.00002 in UK, non-significant in US). This SNP combination
was much more significant than the analysis of these SNPs alone
30 (combined population p=0.04 for E + 1 and p = 0.93 for E -1 E^ee SNP 
combinations had p-values < 0.01 in gene 454 in the combined population. 
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with the most significant haplotype consisting of SNP E 2 and F -2. This 
haplotype had a p-value of 0.001 in the combined population (p=0.008 in the
US, p<0.05 in UK). Although this result was more significant than the
analysis of these SNPs alone, the levels of significance found in the
the analysis of the SNPs alone (in the combined population: E 2 and M +1 , 
p=0 003- G -1 and M *1. p=0.004; E -1 and E 2. p=0.004; E 2 and H 1. 
p=0 004- E 2 and O 6, p=0.004; E 2 and M 1, p=0.004: H 1 and O 3, 
p=0 005; E 1 and E 2, p=0.005; B 1 and H 1, p=0.006; E 1 and H 
p=0 006; E 1 and M +1, P=0.007; E 1 and F -2. p=0.007; B 1 and E 2. 
p=0.007; G -1 and H 1, P =0.008; H 1 and O 1, P=0.008 ; F -2 and M 
p=0.009;E2andG-1.p=0.01). 

In Gene 561, a single haplotype (J 1 and H 1) reached statistical
significance at the 0.01 level in the combined population (p=0.008), while all
seven haplotype combinations with SNP Y +1 yielded significant results at the
0.01 level in the UK population (P 1 and Y +1, p=0.0006; C 1 and Y +1,
p=0.0007; E 1 and Y +1, p=0.0008; J 1 and Y +1, p=0.001; H 1 and Y +1,
p=0.001; B +1 and Y 1, p=0.002; B 1 and Y +1, p=0.002; Y +1 and X -3,
p=0.004). The SNP combination of H 1 and E 1 had a significant
association in the US population (p=0.009). In addition, in the combined
population, the haplotypes formed by SNPs A 2 and A +4 in Gene 757 were
more significantly associated with the disease (p=0.004) than any of these
SNPs alone (p=0.03 for A 2, p=0.60 for A +4).

2. Bronchial hyper-responsiveness: A similar test for association of
2-SNP-at-a-time haplotypes with BHR (PC20 ≤ 16 mg/ml) was performed. In
Figures 21 and 22, the haplotype analysis (2-at-a-time) for all SNPs in Gene
214, Gene 436, Gene 454, Gene 515, Gene 561, Gene 570, Gene 698,
Gene 702, Gene 722, and Gene 757 is shown for the combined population,
and for the UK and the US populations, respectively.
The most significant associated haplotype was formed by SNPs E 1 
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popular (p=0.0002 in UK. non-significant in US). Four SNP commons 
had p-va,ues <0.01 in Gene 454 in the combined popu.ation. (E -1 and E 2. 
p=0 004- E 1 and E 2. p=0.004 ; E 2 and G -1. P=0.005; E 2 and F -2. 
Uooei. and one SNP combination in the UK (E 1 and E 2. p=0W). In 
5 Gene 56, . .our hapiotypes reached statistica, signmcance a, the 

in the combined population (J 1 and H 1 , P=0.003; E 1 and Y +1 p=0.003 
1 and E 1 P=0 006; J 1 and Y +1. P=0.009). one in the UK populate (J 1 

* " " r J . , ,c iu 1 and E 1 0=0.002). In addition, in 

and Y +1, p=0.01 ). and one in the US (H 1 and E . p u 

the combined population, a haplotype formed by SNPs in Gene 757 (A 2 and
A +4, p=0.003) was significant at the 0.01 level.

3. Total IgE: A similar test for association of 2-SNP-at-a-time haplotypes
with elevated levels of total IgE was performed. In Figures 23 and 24, the
haplotype analysis (2-at-a-time) for all SNPs in Gene 214, Gene 436, Gene
454, Gene 515, Gene 561, Gene 570, Gene 698, Gene 702, Gene 722, and
Gene 757 is shown for the combined and for the UK and the US
populations, respectively.

The most significant associated haplotype was formed by SNPs E 1
and E +1 from Gene 214, with a p-value of 0.000003 in the combined
population (p=0.000005 in UK, non-significant in US). Thirteen SNP
combinations had p-values <0.01 in Gene 454 in the combined population (
1 and O 1. P=0.002; H 1 and O 1, P=0.002; O 1 and O 3. p-0^004. E 1 and 
oT p =0 005; G -1 and H 1 . p=0.006; H 1 and O 3. p=0.007; F -2 and M . 
° 0 008 H 2 and O 1. P=0.008; B 1 and O 1. P=0.009; M 1 and O . 
;: 0 009 : G , and M ♦ , P-9 - O 1 P^. ^ 

9* D =0 01) one SNP combination in the UK ( K 1 ana u 

25 p-u.ui ), w p=0.0001 ; E 1 and H 
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p=0OOV, M 1 and O 5. p=0.002; H 2 and M 1. p=0.002; E 1 and M 
p=0 002; K 1 and M 1, p=0.002; F -2 and M 1, P=0.004; L -1 and M 
p=0 005; M 1 and M 2, p=0.005; B 1 and M 1 , p=0.005; F -2 and G -1. 
p=0 005-. M 1 and O 1, p=0.006 ; E 2 and M 1, p=0.006; M 1 and M 
p=0.007). In Gene 561, three haplotypes reached statistical significance at
the 0.01 level in the UK sample (E 1 and Y +1, p=0.008; C 1 and Y +1,
p=0.008; H 1 and Y +1, p=0.009), and two reached statistical significance in
the US sample (E 1 and C 1, p=0.004; E 1 and Y +1, p=0.004). In Gene
757, the haplotype formed with SNP A 2 and SNP A +4 was significant at the
0.01 level in the combined population (p=0.002), and in the UK population
(p=0.006).

4. Specific IgE: A similar test for association of 2-SNP-at-a-time
haplotypes with elevated levels of specific IgE was performed. In Figures 25
and 26, the haplotype analysis (2-at-a-time) for all SNPs in Genes 214, 436,
454, 515, 561, 570, 698, 702, 722 and 757 is shown for the combined and
for the UK and the US populations, respectively.

The most significant associated haplotype was formed by SNPs E 
and E +1 from Gene 214, with a p-value of 0.000006 in the combined
population (p=0.000003 in UK, non-significant in US). Sixteen SNP
combinations had p-values <0.01 in Gene 454 in the combined population (H
,a„d03, p=0.0007; H 1 and K 1 , p=0.002; G -1 and H 1 , p=0.002 ; H 1 and 
O 1 p=0.003; E 2 and H 1, p=0.003 ; H 1 and H 2, p=0.003; E 1 and H 1 
p=o'.003; B 1 and H 1. P=0.003; H 1 and M 2, p=0.003; H 1 and O . 
p=0.004; H 1 and M + 1, p=0.004 ; H 1 and U -1, P=0.004; F -2 and H 1 
,0.004; H 1 and M 1, P=0.005 ; E -1 and H 1. P=0.006; H 1 and O . 
p=0.007), and thirty-three SNP combinations in the US (H 1 and , W M . 
U.0005; E 1 and H 1. P=0.0006; H 1 and O 5, pMMJDW H - 
U.0008; H 1 and M 1. p=0.0009; H 1 and K 1. ^ ' ^ ^ 
p =0 001- K 1 and M 1, P=0.001; H 1 and H 2. p=0.001; M 1 and O 3. 
P p =0 00 • O -1 and H 1. p=0.001; H 1 and O 3. p=0.001; B and H . 
P p=0 001 H 1 and L -1, P=0.00 2; E -1 and H 1, P=0.002; E 2 and H 1. 
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p =0 002- H 1 and M 2. p=0.002; H 1 and O 1. p=0.002; M 1 and M *1. 
p=0 002 ; K 1 and M +1 , P=0.003; M 1 and O 5. p=0.003; M -1 and O 5 
^al; E 2 and M 1, p-0.005; F -2 and K 1. P=0.005; E 1 and M 
U.005 K 1 and O 6. p=0.006 ; M + 1 and O 6. p=0.006; H 2 and M l. 
p=0.007; F -2 and O 3, p=0.008; M +1 and O 3, p=0.008; E 2 and M +1,
p=[...]; M 1 and O 1, p=0.009; O 5 and O 8, p=0.009). In Gene 561, two
haplotypes reached statistical significance at the 0.01 level in the UK
population (E 1 and Y +1, p=[...]; C 1 and Y +1, p=0.007), and two
reached statistical significance in the US population (E 1 and C 1, p=[...];
H 1 and E 1, p=0.009). In Gene 757, the haplotype formed with SNP A2
and SNP A +4 was significant at the 0.01 level in the combined population
(p=0.002) and in the UK sample (p=0.006).

In summary, haplotype analysis of the SNPs provided additional
evidence demonstrating the presence of asthma susceptibility genes on
chromosome 12. In some SNP combinations, the level of significance of
association was increased by an order of magnitude.
EXAMPLE 13: TRANSMISSION DISEQUILIBRIUM TEST (TDT)

A transmission disequilibrium test (TDT) was conducted for Gene 454. By
selecting a single affected [...] to [...] a particular allele or genotype
was preferentially transmitted to an affected individual over what would be
expected by chance [...] the same families that contributed to the linkage
signal. The significance levels were estimated by Markov chain Monte Carlo
simulation methods as implemented in TDTEX from the SAGE package
(Department of Epidemiology and Biostatistics, Rammelkamp Center for
Education and Research, MetroHealth Campus, Case Western Reserve
University, Cleveland, OH). As only heterozygote parents contributed
information to the TDT test, SNP haplotypes (all 2-at-a-time and all
3-at-a-time) were also constructed based on family data with the program
GENEHUNTER (Kruglyak et al., 1996). This served to increase the
informativeness of the single SNPs. These haplotypes were then used as
"alleles" in future TDT analyses. In addition, p-values obtained from the
TDT analyses were compared to the p-values obtained from the haplotyping
in the case/control setting. To check for consistency, p-values associated
with testing frequencies in cases and controls were examined when selecting
the overtransmitted alleles or genotypes identified in the TDT test.
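
The logic of the TDT can be illustrated with the classical asymptotic form of the test, in which transmissions from heterozygous parents are compared using a McNemar-type chi-square statistic with one degree of freedom. The sketch below is illustrative only: it is not the Markov chain Monte Carlo procedure implemented in TDTEX, and the function name tdt_statistic and the transmission counts are hypothetical.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """Asymptotic TDT for one allele.

    transmitted / untransmitted: number of times heterozygous parents
    did / did not pass the allele to an affected child.
    Returns (chi-square with 1 df, p-value).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, chosen only to show the calculation.
chi2, p_value = tdt_statistic(transmitted=40, untransmitted=20)
print(f"chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

Homozygous parents contribute to neither count, which is why the text notes that only heterozygote parents are informative and why haplotypes were used to increase informativeness.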

1. Asthma Phenotype: Three candidate SNPs for Gene 454
were typed in the extended population in order to investigate further the
association seen in the case-control study. All three SNPs result in amino
acid changes (E 2, histidine to tyrosine (C→T); H 1 and H 2, arginine to
histidine (G→A)). Results are shown in Table 16. Column 1 lists the
exon(s) containing the SNP(s) of interest. Column 2 lists the
overtransmitted alleles or genotypes. Column 3 lists the TDT p-values.
Columns 4, 5, and 6 list the p-values, the frequencies in the cases, and the
frequencies in the controls of the overtransmitted alleles or genotypes,
respectively.

Since the TDT was not influenced by admixture, it was performed
using the combined US and UK populations. For SNPs E 2 and H 1, the
genotype formed by the CA/CA haplotypes was significantly overtransmitted
to the affected individuals (p=0.04). In addition, this genotype was found in
only 2% of the controls while 12% of the cases harbor this genotype. This
difference was highly significant (p=0.0002). For the SNP combination
comprising H 1 and H 2, the AG/AG genotypes were overtransmitted to
affected individuals. This result approached statistical significance
(p=0.06). Moreover, this genotype was more frequent in the cases (14%)
compared to the controls (2%), and this difference was highly significant
(p=0.00005). The TDT results supported the association previously
observed in the case-control studies for Gene 454. The results also pointed
to a recessive mechanism of transmission, as the genotype test showed the
strongest evidence of association.
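
A case/control comparison of genotype frequencies such as the one described above can be sketched with a two-sided Fisher exact test on a 2x2 table. This is an illustrative assumption about the kind of test involved, since the text does not specify the procedure behind the reported case/control p-values. The counts below are hypothetical (the sample sizes are not given in this section), so the resulting p-value is not expected to reproduce the reported values.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: genotype carriers vs. non-carriers
# among 100 cases and 100 controls.
p_value = fisher_exact_two_sided(12, 88, 2, 98)
print(f"p = {p_value:.4f}")
```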

TABLE 16: TDT ANALYSIS OF ASTHMA PHENOTYPE 



Asthma Yes/No 



Combined US and UK 



Exon 



Over-Transmitted TDT 
Allele | P-value 



454 E 2 



454 H 1 



454 E 2 H 1 



Case/Control
p-value



1 oooc 



Control 
Freq 



Case 
Freq 



0.0058 50.9% 



0.34E 



0.0032 



CA 



454 E 2 H 2 



CG 



454 H1H2 



AG 



454 E 2 H 1 H 2 CAG 



0.1094 



0.3874 



0.4612 



0.0900 



0.7801 



0.0008 



0.0097 



0.0036 



0.2167 



Exon
454 E 2



Over-Transmitted 
Genotype 



154 H 1 



TT 



AA 



154 H 2 



GG 



CA/CA 



154 E 2 H 2 



;g/cg 



TDT
p-value



0.0015 



22.1% 



98.0% 



15.4% 



48.4% 



20.0% 



15.3 e 



39.7% 



33.0% 



97 .4 e 



27.6% 



59.4% 



30.4% 



26.8% 



Case/Control
p-value



0.837* 



0.0070 



0.1051 



0.0022 



Control 
Freq 



Case 
Freq 



28.6% 



4.9 C 



0.1101 



0.7776 



0.035S 



0.0002 



0.282J 



0.2477 



454 E 2 H 2 



rG/TG 



M3/AG 



F ? H 1 H 2ICAG/CAG 



0.282J 



96. 0 C 



1.5« 



26.8% 



0.0637 



0.0038 
0.0000 



13.7% 



17.0% 



94.8% 



11.6% 



33.0% 



0.0871 



0.0001 



26.3% 
2.0% 



12.2% 



1.0% 



14.3% 



10.7% 



2. Bronchial Hyper-responsiveness: The TDT analyses were
repeated using only the asthmatic pairs that satisfied the additional criteria
of having a PC20 ≤ 16 mg/ml (Table 17). As for the case of the asthma
yes/no phenotype, significance was reached with the genotypic TDT test.
For this subphenotype, genotype AA of SNP H 1 was overtransmitted to
affected individuals (p=0.04). This genotype was also present more often in
the cases than in the controls (17% cases, 5% controls, p=0.02). Two
haplotype combinations had overtransmitted genotypes that approached
statistically significant levels: genotype CA/CA for SNPs E 2 and H 1
(p=0.06) and genotype CAG/CAG for SNPs E 2, H 1 and H 2 (p=0.06). Both
of these genotypes were found more often in the cases (CA/CA 13%,
CAG/CAG 11%) than in the controls (CA/CA 2%, CAG/CAG 1%), and these
differences were highly significant (p=0.0008 for CA/CA, p=0.0014 for
CAG/CAG).

TABLE 17: TDT ANALYSIS OF BHR PHENOTYPE

3. Total IgE: The TDT analyses were also performed using the
phenotype previously described for total IgE (Table 18). Again, significance
was reached with the genotypic TDT test. For this subphenotype, genotype
AA of SNP H 1 was overtransmitted to affected individuals (p=0.03). This
genotype was also present more often in the cases than in the controls
(21% cases, 5% controls, p=0.0001). Two genotypes for the SNP
combination formed by E 2 and H 1 had statistically significant
overtransmission: genotype CA/CA and genotype CA/TA [...]. Both
genotypes were found more often in the cases (CA/CA 12%, CA/TA 9%)
than in the controls (CA/CA 2%, CA/TA 3%), and these differences were
significant (p=0.0009 for CA/CA, p=0.03 for CA/TA).

TABLE 18: TDT ANALYSIS OF TOTAL IgE PHENOTYPE

4. Specific IgE: The TDT analyses were performed using the
phenotype previously described for specific IgE (Table 19). There were no
alleles or genotypes that were significantly overtransmitted at the 0.05 level.
However, the test for the overtransmission of genotype AA of SNP H 1 had a
p-value <0.1. This genotype was present more often in the cases than in
the controls (22% cases, 5% controls, p=0.0003).

TABLE 19: TDT ANALYSIS OF SPECIFIC IgE PHENOTYPE 



Combined US and UK 



Exon 



Over- 
Allele 



454 HI 



454 H 2 
454 E 2 H 1 



454 E 2 H 2 



Transmitted TDT 



G 

CA 



TG 



p-value 



I 



Case/Control
p-value



Control 
Freq 



0.1555 



0.3757 



0.7101 



0.00006 



0.8317 



0.0006 



1 



22.1% 



0 3392 98.0% 



0 0332 49.5% 



Case
Freq



37.5% 



96.4% 



38.6% 



4M H 1 H 2 
454 E 2 H 1 H 2|CAG 



454 H 1
0.1369
0.6602 



0.001 : 
0.001: 



20.0% 
15.3% 



Over-Transmitted TDT 
Genotype \ p-value 



Case/Control 
p-value 



33.9% 
29.3% 



Control 
Freq 



154 H 2 



AA_ 

Eg" 



154 E 2 H 1 



CATTA 



454 E 2 H 2 



|454 H 1 H 2 



TG/TG 



0.0910 



0.3740 



0.2586 



0.7369 



AG/AG 



454 E 2 H 1 H 2 CAG/CAG 
4S4 E 2 H 1 H 2ICGA/CGA 



0.1104 



0.3841 



0.3841 



0.00003 



0.3340 



0.0314 



Case 
Freq 



4.9% 



96.0% 



2.5% 



0.0118 



0.00003 



0.0004 



1 .0000 



26.3% 



2.0% 



1.0% 



0.0% 



22.1% 



92.9% 



8.8% 



1 1 .4% 



17.7% 



11.8% 



0.0% 



EXAMPLE 14: GENE ANALYSIS AND POTENTIAL FUNCTION

1. Functional Role of Gene 454 in Asthma and Related Diseases

Extracellular ATP triggers a variety of responses in several cell types,
including contraction of smooth muscles, regulation of nitric oxide production
from endothelium, stimulation of cytokine release from immune cells, and
modulation of several other metabolic pathways. The receptors that
mediate these diverse effects are the P2 purinoreceptors, which are divided
into two subgroups: P2Y and P2X receptors. The P2X receptors are a
family of multimeric ligand-gated ion channels activated solely by
extracellular ATP and structurally distinct from other ligand-gated channels.

Gene 454 represents the seventh member of the P2X receptor family,
P2X7. The nucleic acid sequence of Gene 454 corresponds to SEQ ID
NO:19 and the encoded amino acid sequence corresponds to SEQ ID
NO:111 as disclosed herein (see Figures 7A-7H). The Gene 454 transcript
is 5.087 Kb, the gene is ~55 Kb in size, and includes 13 exons. The Gene
454 ORF is 1788 bp long and encodes a 596 amino acid protein. The 5'
and 3' untranslated regions are 69 bp and 3230 bp in length, respectively.
As determined by the experiments described herein, Gene 454 is expressed
in brain, heart, skeletal muscle, spleen, kidney, liver, placenta, lung,
leukocytes, lymph and fetal liver tissues (Figure 6).
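
The figures quoted for the Gene 454 transcript are internally consistent, as the short check below shows. All numbers are taken from the paragraph above; only the variable names are introduced for illustration.

```python
# Lengths in base pairs, as stated in the text for Gene 454.
utr5_bp = 69
orf_bp = 1788
utr3_bp = 3230

transcript_bp = utr5_bp + orf_bp + utr3_bp
codons = orf_bp // 3

print(transcript_bp)  # 5087, i.e. the stated 5.087 Kb transcript
print(codons)         # 596, matching the stated protein length
```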

Data have indicated that the P2X7 receptor is involved in cell death,
cytokine release, and the shedding of surface antigens. The P2X7 receptor
also mediates activation of the transcription factors NF-κB and NFAT.
The P2X7 receptor displays unique permeability properties. At low ATP
concentrations P2X7 forms small ATP-gated cation channels, allowing the
influx of small cations, including Ca2+, into the intracellular environment.
Notably, in rat peritoneal mast cells, there is a direct correlation between the
influx of Ca2+ and the release of histamine as a consequence of ATP levels
(Schulman et al., 1999, Am. J. Respir. Cell Mol. Biol. 20:530-537). In
addition, at these levels of ATP, various proteases are activated including
membrane metalloproteases and intracellular caspases (Gu et al., 1998,
Blood 92:946-951).

At high ATP concentrations, the P2X7 receptor pore size increases,
allowing the passage of anions as well as cations up to 900 daltons in size
(Nihei et al., 2000, Mem. Inst. Oswaldo Cruz 95:415-428). Interestingly,
inhalation of aerosolized ATP has been shown to trigger bronchoconstriction
in healthy and asthmatic individuals. In asthmatics, ATP was 50 times more
potent than methacholine, and 87-fold more potent than histamine, in
producing a 15% decrease in FEV1 (Schulman et al., 1999, Am. J. Respir.
Cell Mol. Biol. 20:530-537). This suggests that extracellular ATP acts as an
important modulator of pro-inflammatory regulation via the P2X7 receptor.

The P2X7 protein contains two transmembrane domains connected
by a large extracellular loop, and intracellular N-terminal and C-terminal
domains (Figure 10). P2X7 shares significant amino acid identity with the
other members of the P2X receptor family (30-40%), except in the
C-terminus domain, which is 240 amino acids long. P2X7 contains a long
unique carboxyl terminus, which appears to be involved in the permeability
properties of the P2X7 receptor. Truncation of the cytoplasmic tail abolishes
ATP-induced uptake of the fluorescent dye YoPro-1 and ethidium bromide
(Gu et al., 2001, JBC 276:11135-11142). Further, a SNP (A→C) in the
cytoplasmic tail was identified in the Caucasian population. The SNP results
in a glutamic acid to alanine change at amino acid 496. This amino acid
substitution results in a loss of functional P2X7 in homozygotes, and results
in a 50% loss of function in heterozygotes (Gu et al., 2001, JBC 276:11135-
11142). The expression of P2X7 has been observed mainly in cells of the
immune and hematopoietic system, and P2X7 has been shown to mediate
the ATP-induced apoptotic death in monocytes, macrophages, and
lymphocytes. However, P2X7 expression has been observed in other cell
types at lower levels. In particular, fibroblasts express P2X7, and are
responsive to ATP.

Fibroblasts are non-excitable cells that play a role in the modulation 
of a variety of microenvironmental situations to which these cells are 
exposed. In the lung, fibroblasts lie in the lamina propria under the 
basement membrane. The bronchial epithelium lies above the basement 
membrane, and is attached thereto. In accordance with one model of 
respiratory diseases, allergens cause the cells of the bronchial epithelium to 
release their cytoplasmic contents. The cellular ATP concentration of each 
cell is estimated to be 5-10 mM. The released ATP immediately passes 
through the basement membrane by passive diffusion. This triggers the 
P2X7 receptors on the surface of the fibroblast cells to dilate, forming an 
open channel. The P2X7 receptors allow the influx of cations and anions up 
to 900 daltons. One of these ions triggers a signal transduction cascade 
that induces the final step in the post-translational processing of pro-IL-1β, a
multipotential inflammatory mediator (Solle et al., JBC 276:125-132). The
mature IL-1β binds to receptors on target cells that elicit signaling cascades.
This leads to the up-regulation of gene products such as matrix
metalloproteases, cyclooxygenase-2, IL-6 and cellular adhesion molecules,
which contribute to inflammation. 

IL-6 is an important pro-inflammatory cytokine that is secreted by
mononuclear phagocytes, antigen-presenting cells, and fibroblasts. In
accordance with the current knowledge in the art, secretion of IL-6 creates a
pro-inflammatory microenvironment that induces the release of other factors
such as growth factors, cytokines, and prostaglandins. This, in turn,
enhances the stimulation and propagation of fibroblasts, and leads to an
increase in the release of pro-inflammatory molecules. Fibroblasts also play
a role in exuding extracellular matrix. Notably, in asthmatics, the basement
membrane is thicker than in normal individuals due to the abnormal repair of
the bronchial epithelium by fibroblasts. Further, myofibroblasts are also in
abundance in asthmatic individuals, due in part to pro-inflammatory [...].

The Gene 454 SNPs (Table 10; Figures 7A-7H, and 10) identified by
the experiments described herein result in nucleotide changes that may
disrupt the intracellular function, stability, splicing, or expression of the
encoded protein. It is possible that [...] increase or decrease in the normal
activities or levels of the P2X7 receptor, thereby affecting the
pro-inflammatory response triggered by ATP and resulting in asthmatic
symptoms. The sum of these data indicates that Gene 454 (P2X7) is
involved in the pathophysiology of respiratory disorders, including asthma.

2. Functional Role of Gene 561 in Asthma and Related Diseases

[...] a scaffold protein [...]. RIMBP2 protein binds to RIM, a putative
effector protein [...] that contains an SH3 domain, [...] binding to RIM.
RIMBP2 also contains fibronectin type III domains, rarely observed in
intracellular proteins (Wang et al., 2000, JBC 275:20033-20044).

The nucleotide sequence of Gene 561, alternate splice variant (also
referred to as 561.nt1), corresponds to SEQ ID NO:31 (Figures 27A-27K),
and the encoded amino acid sequence (also referred to as [...])
corresponds to SEQ ID NO:[...]20. The nucleotide sequence of [...] (also
[...] NO:32 (Figures 28A-28C), and the [...]. As determined by the
experiments described herein, the transcript size of
Gene561.nt1 and Gene561.nt2 is 7.9 and 6.1 Kb, respectively (Figure 8).
An alternative splice site has been identified at a position in the 3' UTR,
between exons J and I (Figure 8). RT-PCR data indicate that Gene561.nt1
is clearly expressed in lung at low levels, but Gene561.nt2 is not (Figure 8).
The genomic structure of Gene 561 comprises 21 exons and spans ~200
Kb.

ATP has been shown to stimulate vagal afferent nerve terminals in
the lung. This can lead to local axon and central vagal reflexes, which are
known to play a major role in neurogenic inflammation and
bronchoconstriction. Nocturnal asthma characterized by acute
bronchoconstriction in the morning has been associated with platelet
activation, which releases large amounts of ATP, and augmentation of vagal
tone (Schulman et al., 1999, Am. J. Respir. Cell Mol. Biol. 20:530-537). It is
possible that Gene 561 recruits synaptic vesicles for neurotransmitter
release at the afferent nerve terminals in lung. This, in turn, may be
important for bronchoconstriction/dilation. Accordingly, the Gene 561 SNPs
that show association with asthma (Table 10, Figures 27A-27K, and Figures
28A-28C) may disrupt the function, stability, or expression of the encoded
protein. The altered Gene 561 protein may cause an increase or decrease
of neurotransmitter, resulting in augmentation of the vagal tone, and leading
to bronchoconstriction. The sum of these data indicates that Gene 561 is
involved in the pathophysiology of respiratory disorders including asthma.

3. Functional Role of Gene 757 in Asthma and Related Diseases

Immunochemical studies have shown that both TGF-β
(transforming growth factor β) and EGFR1 (epidermal growth factor
receptor) are highly expressed in areas of bronchial epithelial injury, and that
these parallel pathways operate to repair epithelial cells (Puddicombe et al.,
2000, FASEB J. 14:1362-1374). EGFR1 stimulates epithelial repair, while
TGF-β regulates the production of profibrogenic growth factors and
proinflammatory cytokines leading to extracellular matrix synthesis. TGF-β
also acts in the WNT signaling pathway, which functions in a variety of
developmental processes, including cell differentiation, cell polarity, cell
migration, and cell proliferation (Calvo et al., 2000, PNAS 97:12776-12781).
The WNT components activate the frizzled receptors, which stabilize
β-catenin. This, in turn, activates the expression of target genes in the
nucleus (Kühl et al., 2000, TIG 16:279-283).

Gene 757 is frizzled 10 (FZD10), a putative receptor for Wnt-7a
(Kawakami et al., 2000, Develop. Growth Differ. 42:561-569). The nucleic
acid sequence of Gene 757 corresponds to SEQ ID NO:90, and the
encoded amino acid sequence corresponds to SEQ ID NO:153 (Figures
9A-9F). As determined by the experiments described herein, Gene 757 is
expressed in brain, heart, skeletal muscle, colon, thymus, spleen, kidney,
small intestine, placenta, and lung (Figure 6). The transcript size of Gene
757 is 3.6 Kb, of which 3253 bp have been identified (Figure 6). The
transcript is contiguous with genomic DNA, indicating that Gene 757 is an
intronless gene. The Gene 757 ORF is 1746 bp long and encodes a 581
amino acid protein. The 3' untranslated region is 1052 bp long, and 456 bp
of the 5' UTR has been sequenced.

The FZD10 protein is a receptor composed of a seven-transmembrane
repeat with an N-terminal cysteine-rich domain and a C-terminal
Ser/Thr-XXX-Val motif. FZD10 shares 65.7% overall amino acid
identity with FZD9 (Koike et al., 1999, Biochem. Biophys. Res. Commun.
262:39-43). Frizzled 10 is a cell surface receptor for the secreted
glycoprotein Wnt-7a. In accordance with one model of respiratory diseases,
the WNT signaling gene acts in concert with the frizzled 10 receptor to
trigger a signal transduction pathway leading to the activation of genes
involved in bronchial epithelial repair. Thus, Gene 757 SNPs that are
associated with the asthma phenotype (Table 10 and Figures 9A-9F) may
alter the signal transduction pathway, causing either the over- or
underexpression of genes involved in bronchial epithelium repair. This
alteration, in turn, may result in the activation of the epithelial-mesenchymal
trophic unit in the lung, placing the bronchial epithelium in a "state of repair"
mode, and leading to airway remodeling (Holgate et al., 1999, Clin. Exp.
Allergy Suppl 2:90-95). The sum of these data indicates that Gene 757
(FZD10) is directly involved in the pathophysiology of respiratory disorders
including asthma.

EXAMPLE 15: PROTEIN EXPRESSION AND PURIFICATION

Expression and purification of the chromosome 12q23-qter proteins
of the invention can be performed essentially as follows. Nucleotide
sequences (e.g., one or more of SEQ ID NO:1 to SEQ ID NO:92 and SEQ
ID NO:156 to SEQ ID NO:4684) are prepared by polymerase chain reaction
(PCR). Synthetic oligonucleotide primers specific for the 5' and 3' ends of
the nucleotide sequences are designed and purchased from Life
Technologies (Gaithersburg, MD). All forward primers (specific for the 5'
end of the sequence) are designed to include an NcoI cloning site at the 5'
terminus. These primers are designed to permit initiation of protein
translation at the methionine residue encoded within the NcoI site followed
by a valine residue and the protein encoded by the nucleotide sequence. All
reverse primers (specific for the 3' end of the sequence) include an EcoRI
site at the 5' terminus to permit cloning of the sequence into the reading
frame of the pET-28b expression vector (Novagen). The pET-28b vector
provides a sequence encoding an additional 20 carboxyl-terminal amino
acids including six histidine residues, which comprise the His-Tag affinity
tag.
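
The primer layout described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the primer design actually used: the target sequence, the 18-nucleotide annealing length, the choice of GTG as the valine codon, and the function names are all hypothetical. The recognition sequences CCATGG (NcoI) and GAATTC (EcoRI) are the standard sites for those enzymes. Reading-frame adjustment relative to the vector is not handled here.

```python
NCOI_SITE = "CCATGG"   # contains ATG, the initiating methionine codon
ECORI_SITE = "GAATTC"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(target, anneal_len=18):
    """Build forward and reverse primers with 5' cloning sites.

    Forward: NcoI site, then "TG" so that the final G of the site
    starts a GTG (valine) codon, then the 5' end of the target.
    Reverse: EcoRI site, then the reverse complement of the 3' end.
    """
    target = target.upper()
    if len(target) < anneal_len:
        raise ValueError("target shorter than annealing length")
    forward = NCOI_SITE + "TG" + target[:anneal_len]
    reverse = ECORI_SITE + reverse_complement(target[-anneal_len:])
    return forward, reverse


# Hypothetical target sequence, used only to exercise the function.
target = "GCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTT"
forward, reverse = design_primers(target)
print(forward)
print(reverse)
```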

Genomic DNA prepared from the 12q23-qter including the BAC
sequences including RPCI-11_0899A17, RPCI-11_0666B20,
RPCI-11_0723P10, RPCI-11_0831E1b, RPCI-11_0932D22 and
RPCI-11_0702C13 (SEQ ID NO:719 to SEQ ID NO:978; Table 3A) and BAC end
sequence (SEQ ID NO:156 to SEQ ID NO:693) region is used as the
template for PCR amplification (Ausubel et al., 1994). For PCR amplification,
cDNA (50 ng) is introduced into a reaction vial containing 2 mM MgCl2, 1 [...]
synthetic primers (forward and reverse primers complementary to and
flanking a defined 12q23-qter region), 0.2 mM of each dNTP (dATP,
dGTP, dCTP, and dTTP), and 2.5 U heat stable DNA polymerase (Amplitaq,
Roche Molecular Systems, Inc., Branchburg, NJ) in a final volume of 100 µl.
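
The reagent volumes for such a reaction follow from the dilution relation C_stock x V_stock = C_final x V_final. The sketch below uses the final concentrations and the 100 µl final volume given above. The stock concentrations (25 mM MgCl2, 10 mM of each dNTP, 5 U/µl polymerase) are assumptions chosen for illustration, and primers and template are left out because their stock concentrations are not stated.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock to add so that C_stock * V_stock = C_final * V_final."""
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 100.0  # final reaction volume from the text

# Final concentrations are from the text; stock concentrations are assumed.
mgcl2_ul = stock_volume_ul(2.0, 25.0, FINAL_VOLUME_UL)   # mM final, mM stock
dntp_ul = stock_volume_ul(0.2, 10.0, FINAL_VOLUME_UL)    # mM final, mM stock
polymerase_ul = 2.5 / 5.0                                # 2.5 U from 5 U/µl

# Volume left for buffer, primers, template, and water.
remaining_ul = FINAL_VOLUME_UL - (mgcl2_ul + dntp_ul + polymerase_ul)

print(f"MgCl2 {mgcl2_ul:.1f} µl, dNTPs {dntp_ul:.1f} µl, "
      f"polymerase {polymerase_ul:.1f} µl, remaining {remaining_ul:.1f} µl")
```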

Upon completion of thermal cycling reactions, each sample of
amplified DNA is purified using the Qiaquick Spin PCR purification kit
(QIAGEN, Gaithersburg, MD). PCR products are subjected to digestion with
the restriction endonucleases, e.g., NcoI and EcoRI (New England BioLabs,
Beverly, MA) (Ausubel et al., 1994). The digested DNA is subjected to
electrophoresis on 1.0% NuSeive (FMC BioProducts, Rockland, ME)
agarose gels. The gel is incubated with ethidium bromide, and the digested
DNA is visualized with long-wave UV irradiation. The DNA fragments are
isolated from the agarose gel, and are purified using the GeneClean Kit
protocol (BIO 101, Vista, CA).

The pET-28b vector is prepared for cloning by digestion with
restriction endonucleases, e.g., NcoI and EcoRI (New England BioLabs,
Beverly, MA) (Ausubel et al., 1994). The digested pET-28b expression
vector is ligated to the gel-isolated DNA fragments (Ausubel et al., 1994).
The ligated product is used to transform E. coli (e.g., BL21) (Ausubel et al.,
1994) as follows. Briefly, 1 µl of ligation reaction is mixed with 50 µl of
electrocompetent BL21 cells, and the cells are subjected to a high voltage
pulse. Following this, cells are incubated in 0.45 ml SOC medium (0.5%
yeast extract, 2.0% tryptone, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10
mM MgSO4, and 20 mM glucose) at 37°C with shaking for 1 hr. Cells are
then spread on LB agar plates containing 25 µg/ml kanamycin sulfate, and
grown overnight. Transformant BL21 colonies are then isolated and
analyzed to evaluate cloned inserts, as described below.

Individual BL21 transformant colonies are analyzed by PCR
amplification. The PCR reaction uses the same forward and reverse
primers specific for the 12q23-qter region sequences that are used in the
cloning step. Successful amplification verifies the ligation of the sequence in
the expression vector (Ausubel et al., 1994). Individual BL21 colonies
containing pET-28b vectors with 12q23-qter region nucleotide sequences
are inoculated into 5 ml of LB broth plus 25 µg/ml kanamycin sulfate, and
grown overnight. The following day, plasmid DNA is isolated and purified
using the QIAGEN plasmid purification protocol (QIAGEN Inc., Chatsworth,
CA).

The pET vector can be propagated in any E. coli K-12 strain, e.g.,
HMS174, HB101, JM109, DH5, and the like, for purposes of cloning or
plasmid preparation. Hosts for expression include E. coli strains containing
a chromosomal copy of the gene for T7 RNA polymerase. These hosts are
lysogens of bacteriophage DE3, a lambda derivative that carries the lacI
gene, the lacUV5 promoter, and the gene for T7 RNA polymerase. T7 RNA
polymerase is induced by addition of isopropyl-β-D-thiogalactoside (IPTG),
and the T7 RNA polymerase transcribes any target plasmid containing a
functional T7 promoter, such as pET-28b, carrying its gene of interest.
Strains include, for example, BL21(DE3) (Studier et al., 1990, Meth.
Enzymol. 185:60-89).

To express the recombinant sequence, 50 ng of plasmid DNA isolated
as described above are used to transform competent BL21(DE3) bacteria
(provided by Novagen as part of the pET expression kit) as described above.
The lacZ gene (β-galactosidase) is expressed in the pET-System as
described for the 12q23-qter region recombinant constructions.
Transformed cells are grown in SOC medium for 1 hr, and then plated on LB
plates containing 25 µg/ml kanamycin sulfate. The following day, the
colonies are pooled and grown in LB medium containing kanamycin sulfate
(25 µg/ml) to an optical density at 600 nm of 0.5 to 1.0 OD units. At that
point, 1 mM IPTG is added to the culture for 3 hr to induce gene expression
of the 12q23-qter sequences.

After induction of gene expression with IPTG, cells are collected by
centrifugation in a Sorvall RC-3B centrifuge at 3500 x g for 15 min at 4°C.
Pellets are resuspended in 50 ml of cold [...] mM Tris-HCl, pH 8.0, 0.1 M NaCl,
and 0.1 mM EDTA (STE buffer). Cells are then centrifuged at 2000 x g for
20 minutes at 4°C. Wet pellets are weighed and frozen at -80°C until ready
for protein purification.
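
Centrifugation conditions stated as relative centrifugal force (x g) must be converted to a rotor speed for a given rotor. The sketch below uses the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimeters. The 20 cm rotor radius is a hypothetical value chosen for illustration; the actual radius depends on the rotor fitted to the centrifuge.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rcf_from_rpm(rpm, radius_cm):
    """RCF (x g) produced at a given rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


ROTOR_RADIUS_CM = 20.0  # assumed rotor radius, for illustration only

harvest_rpm = rpm_from_rcf(3500, ROTOR_RADIUS_CM)  # 3500 x g harvest step
wash_rpm = rpm_from_rcf(2000, ROTOR_RADIUS_CM)     # 2000 x g wash step
print(f"harvest: {harvest_rpm:.0f} rpm, wash: {wash_rpm:.0f} rpm")
```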

The disclosure of each of the patents, patent applications, and 
publications cited in the specification is hereby incorporated by reference 
herein in its entirety. 

Although the invention has been set forth in detail, one skilled in the 
art will recognize that numerous changes and modifications can be made, 
and that such changes and modifications may be made without departing 
from the spirit and scope of the invention. 



